In the real business, quite a few report data is from file instead of database. Take the below case for example: To compute over the attendance data files for preparing the salary payable report, the log file for preparing the user behavior analysis report, and the stock trading record file for preparing the stock analysis report. Because the report usually comes with the parameter, people often need to group, filter, and perform other similar operations over the row data. To serve this purpose, the file should support the re-computation which is unfortunately just what the file lack. Nowadays, these two practices are generally adopted:
Import data to database
Since the file itself is not powerful enough in computing, it is natural for people to consider storing the data in the database, with an aim to leverage the computing capability of database to prepare data for report. As is commonly believed, regarding the solution to get the data source tables ready for most files, we cannot care more about the cost problem.
The expenditure usually comprises the costs in these 3 aspects: time, resources, and administrations.
Time cost. After all, programming does take time (even the load action requires scripting). Moreover, the speed of inserting data into the database is not always satisfactory - more time-consuming for more data volume.
Resource cost. Importing data to the database will definitely take up the expensive database space, and incur the extra database purchase cost.
Administration cost. The flat storage mechanism for database increases the difficult of administration when there are massive amount of data tables, along with the higher pressure and the administration and maintenance costs.
Process file using the custom data source of reporting tool
The typical reporting tools all provide the custom data source function. Programmers can thus code by themselves to implement the data source. With this advantage, the high level languages like JAVA and C# can be used to implement the data computation function for the specific files.
The major advantage of this practice is its flexibility. Theoretically, all computations can be completed by programming. But such programming is never an easy thing. Java and other alike senior languages lack the effective support for operations on set. Without the corresponding class library, a simple summary could require many (looping) codes, not to mention the grouping, summarizing, and other operations after associating.
Comparatively, the free edition product of esProc is a more convenient solution to provide the custom data source for the reporting tools.
How esProc implement report with file as data source
Let’s illustrate it with an example:
The data file is the stock price information for each month (a file for a month, and the file is name as “stock_record_yyyyMM”, for example: stock_record_200901. txt, stock_record_200902. txt…), including the stock code, trading date, and closing price for each stock. We need to query out the respective longest consecutive rising days for each stock in a certain period, along with the date and the closing price on which the longest consecutive days have reached.
The report format is shown below:
Stock Code (code of stock)
TradingDate (trading date)
Closing Price (closing price)
BallooningNum (maximum rise in number of days)
The “begin” is the query starting month, and the “end” is the closing month. They are sent by report.
- In A1, with the parameters for the inputted starting and ending months, compute the involved month sequence, that is, the used file
- In A2, with import, loop and retrieve all file data required for this period
- In A3, with derive, add field ballooningNum to the TSeq
- In A4 and A5, after sorting by the stock code and trading date, group by the stock code
- In A6, by looping, judge if the closing price is higher than that of the previous day, and compute the consecutive rising days for each stock on every trading day
- In A7, with pmax, select the records holding the longest consecutive rising days of each stock, including the date on which the longest consecutive rising days is reached and the closing price. Then, sort by ballooningNum in descending order.
- Lastly, in A8, export the result set for use in report with the function result
Report calling esProc script to complete the report presentation
The esProc is packaged into standard JDBC, allowing the reporting tool for invoking the esProc script in the same way as calling stored procedures. For example, in this example reporting tool, firstly, configure the esProc JDBC just in an alike way to configure the database connection; secondly, establish the dataset for stored procedure, and then call stockRecord(? ,?) (In which the “?” is the parameter, representing the starting/ending years or months) to complete the invoking.
Let’s take BIRT as an example to demonstrate the calling procedure:
1.Copy the esProc JDBC driver package to the corresponding directory
2.Configure the report data source
3.Set DataSet and call the esProc script
Then, the whole calling procedure can be completed in a few simple steps. BIRT can use the esProc computing result to directly present the output.