2014年10月21日星期二

esProc Helps Process Structured Text in Java – Grouping and Summarizing

Following problems will arise if you perform conditional filtering on text files in Java alone: 

1. The text file is not a database, so it cannot be accessed by SQL. The code needs to be modified if the expression of grouping and summarizing is changed. Besides, if you want a flexible expression, you have to self-program the dynamic expression parsing and evaluating, resulting in a great amount of programming work.
2. The grouping result produced during traversing will be recorded. If the result is small in size, it can be stored directly in the memory; if the size of the result is too big, an intermediate result will have to be stored as cache files which should be merged later. The process will be quite complicated.

These problems can be solved with ready-made class library by introducing esProc to the programming in Java. Now let’s look at in detail how this will happen.

The text file employee.txt has the employee information. It is required to group by DEPT, count the employees and sum up the total amount of their salary in each group.

The format of text file employee.txtis as follows:
EID   NAME       SURNAME        GENDER  STATE        BIRTHDAY        HIREDATE         DEPT         SALARY
1       Rebecca   Moore      F       California 1974-11-20       2005-03-11       R&D          7000
2       Ashley      Wilson      F       New York 1980-07-19       2008-03-16       Finance    11000
3       Rachel      Johnson   F       New Mexico     1970-12-17       2010-12-01       Sales         9000
4       Emily         Smith        F       Texas        1985-03-07       2006-08-15       HR    7000
5       Ashley      Smith        F       Texas        1975-05-13       2004-07-30       R&D          16000
6       Matthew Johnson   M     California 1984-07-07       2005-07-07       Sales         11000
7       Alexis        Smith        F       Illinois       1972-08-16       2002-08-16       Sales         9000
8       Megan     Wilson      F       California 1979-04-19       1984-04-19       Marketing        11000
9       Victoria    Davis        F       Texas        1983-12-07       2009-12-07       HR    3000
10     Ryan         Johnson   M     Pennsylvania    1976-03-12       2006-03-12       R&D          13000
11     Jacob        Moore      M     Texas        1974-12-16       2004-12-16       Sales         12000
12     Jessica     Davis        F       New York 1980-09-11       2008-09-11       Sales         7000
13     Daniel       Davis        M     Florida      1982-05-14       2010-05-14       Finance    10000
Implementation approach: call esProc script with Java, import and compute the data, then return the result in the form of ResultSet to Java. Because esProc supports dynamic expression parsing and evaluating, it enables Java to process data from the text file as flexibly as SQL does.

For example, you are required to group by DEPT, count the employees and sum up the total amount of their salary in each group. esProc can use an input parameter "groupBy" as the dynamic grouping and summarizing condition, which is shown below:




The value of "groupBy" is DEPT:dept;count(~):count,sum(SALARY):salary. And the code written in esProc is as follows:


A1Define a file object and import the data, with the first row being the title. tab is used as the field separator by default. esProc's IDE can display the imported data visually, as shown in the right part of the above figure.

A2Group and summarize according to specified fields, using macro to realize parsing the expression dynamically. The "groupBy" in this process is an input parameter. In executing, esProc will first compute the expression enclosed by ${…}, then replace ${…} with the computed result acting as the macro string value and interpret and execute the code. The final code to be executed in this example is=A1.groups(DEPT:dept;count(~):count,sum(SALARY):salary).

A3Return the eligible result set to the external program.
You just need to modify the parameter –"groupBy" when grouping fields are changed. For example, you are required to group by DEPT and GENDER, count the employees and sum up the total amount of salary in each group. The value of "groupBy' can be written as DEPT:dept,GENDER:gender;count(~):count,sum(SALARY):salary.

The simple summarizing on all data can be regarded as a special case of grouping and summarizing operation. For example, when counting the number of employees and summating the total amount of salary, the value of parameter "groupBy" can be written as ;count(~):count,sum(SALARY):salary. That the parameter part for grouping is omitted means all data is put into one group. The advantage by doing so is that multiple summarizing results of these data can be computed by traversing them once.

The code of calling this piece of code (which is saved as test.dfx) in Java with esProc JDBC is as follows:

// create a connection usingesProcJDBC
Class.forName("com.esproc.jdbc.InternalDriver");
con= DriverManager.getConnection("jdbc:esproc:local://");
//call the program in esProc (the stored procedure); test is the name of file dfx
com.esproc.jdbc.InternalCStatementst;
st =(com.esproc.jdbc.InternalCStatement)con.prepareCall("call test(?)");
//set the parameters
st.setObject(1,"DEPT:dept,GENDER:gender;count(~):count,sum(SALARY):salary");// the parameters are the dynamic grouping and summarizing fields
//execute the esProc stored procedure
st.execute();
//get the result set
ResultSet set = st.getResultSet();

If the script is simple, the code can be written directly into the program in Java that calls the esProc JDBC. It won’t be necessary to write a special script file (test.dfx):
st=(com. esproc.jdbc.InternalCStatement)con.createStatement();
          ResultSet set=st.executeQuery("=file(\"D:/employee.txt\").cursor@t().groups(DEPT:dept,GENDER:gender;count(~):count,sum(SALARY):salary)");

This piece of code in Java calls a line of code in esProc script directly, that is, get the data from the text file and return the result set to set– the object ofResultSet.
         
If the result set of grouping is still too big to be entirely loaded to the memory, groupx statement will return the grouping result using file cursor. Thus the code written in esProc will be modified like this:

groups function puts the grouping and summarizing result completely in the memory.groupx will write the result into temporary files if the grouping and summarizing result is beyond the boundary of buffer rows, redistribute the memory, and then merge the temporary files. Here the parameter 1000000 refers to buffer rows. The principle of assigning value to it is to make the best of the memory, trying to reduce the number of temporary files as far as possible. The number of temporary files is related to the size of both the physical memory and the record, and should be evaluated during programming. Generally, the recommended number is between magnitudes of several hundred thousand to a magnitude of one million.

Though cell A3 returns a cursor, instead of a result set, to Java, it is no need to modify the calling program of Java. esProc will automatically fetch the data corresponding to the cursor while Java is traversing the data with ResultSet.

This piece of program can be further improved to support filtering before and after the grouping. Now the role of the program is similar to that of where and having in SQL. For example, the statistical object becomes female employees (GENDER=="F"), and it is required to retain only the departments where the number of female employees is greater than ten after grouping and summarizing operation. The code is as follows:

Cellset parameters are made absent here for easy understanding, yet the code is the same as that in the above:A2.groupx(${groupBy}). The parameter of select function can be written as the macro which will be passed from Java program. 

2014年10月20日星期一

esProc Helps Process Structured Text in Java–Conditional Filtering

Following problems will arise if you perform conditional filtering on text files in Java alone: 
1. The text file is not a database,so it cannot be accessed by SQL. The code needs to be modified if filtering conditions are changed. Besides, if you want a flexible conditional filtering as that in SQL, you have to self-program the dynamic expression parsing and evaluating, resulting in a great amount of programming work.
2. Stepwise loading is required for the big files that cannot be loaded into the memory all at once. If the performance must be taken into account, you have to deal with some complicated programming like the management of file buffer and line-splitting computing.

But if esProc is used to help with Java programming, these problems can be solved without self-programmed code. The following example will teach you how to do this in detail.

The text file employee.txt has the employee information. You are required to fetch the data and select from them the female employees who were born on and after January 1, 1981.

The text fileemployee.txtis in a format as follows:
EID   NAME       SURNAME        GENDER  STATE        BIRTHDAY        HIREDATE         DEPT         SALARY
1       Rebecca   Moore      F       California 1974-11-20       2005-03-11       R&D          7000
2       Ashley      Wilson      F       New York 1980-07-19       2008-03-16       Finance    11000
3       Rachel      Johnson   F       New Mexico     1970-12-17       2010-12-01       Sales         9000
4       Emily         Smith        F       Texas        1985-03-07       2006-08-15       HR    7000
5       Ashley      Smith        F       Texas        1975-05-13       2004-07-30       R&D          16000
6       Matthew Johnson   M     California 1984-07-07       2005-07-07       Sales         11000
7       Alexis        Smith        F       Illinois       1972-08-16       2002-08-16       Sales         9000
8       Megan     Wilson      F       California 1979-04-19       1984-04-19       Marketing        11000
9       Victoria    Davis        F       Texas        1983-12-07       2009-12-07       HR    3000
10     Ryan         Johnson   M     Pennsylvania    1976-03-12       2006-03-12       R&D          13000
11     Jacob        Moore      M     Texas        1974-12-16       2004-12-16       Sales         12000
12     Jessica     Davis        F       New York 1980-09-11       2008-09-11       Sales         7000
13     Daniel       Davis        M     Florida      1982-05-14       2010-05-14       Finance    10000
Implementation approach: call esProc script with Java, import and compute the data, then return the result in the form of ResultSet to Java. Because esProc supports dynamic expression parsing and evaluating, it enables Java to perform the conditional filtering as flexibly as SQL does.


For example, it is required to query the information of female employees who were born on and after January 1, 1981. In this case, esProc can use an input parameter “where”as the condition, which is shown below: 

"where" is a string, its values is BIRTHDAY>=date(1981,1,1) && GENDER=="F".

The code written in esProc is as follows:

A1Define a file object and import the data, with the first row being the title. tab is used as the field separator by default. esProc's IDE can display the imported data visually, as shown in the right part of the above figure.

A2Perform the conditional filtering, using macro to realize parsing the expression dynamically. The “where” in this process is an input parameter. In executing, esProc will first compute the expression enclosed by ${…}, then replace ${…} with the computed result acting as the macro string value and interpret and execute the code. The final code to be executed in this example is =A1.select(BIRTHDAY>=date(1981,1,1) && GENDER=="F").

A3Return the eligible result set to the external program.

In esProc, when the filtering condition is changed, you just need to modify "where"– the parameter. For example, it is required to query the information of female employees who were born on and after January 1, 1981, or employees whose NAME+SURNAME is equivalent to "RebeccaMoore". The value of “where” can be written as BIRTHDAY>=date(1981,1,1) && GENDER=="F" || NAME+SURNAME=="RebeccaMoore". After the code is executed, the result set in A2 is as follows:

Call this piece of code in Java with esProc JDBC and get the result. Detailed code is as follows (save the above program in esProc as test.dfx):
          //create a connection withesProc JDBC
Class.forName("com.esproc.jdbc.InternalDriver");
con= DriverManager.getConnection("jdbc:esproc:local://");
//call the program in esProc (the stored procedure); test is the name of file dfx
st =(com.esproc.jdbc.InternalCStatement)con.prepareCall("call test(?)");
//set the parameters
st.setObject(1," BIRTHDAY>=date(1981,1,1) && GENDER==\"F\" ||NAME+SURNAME==\"RebeccaMoore\"");//the parameters are the dynamic filtering conditions
//execute the esProcstored procedure
st.execute();
//get the result set, which is the eligible set of employees
ResultSet set = st.getResultSet();

If the script is simple, the code can be written directly into the program in Java that calls the esProc JDBC. It won't be necessary to write a special script file (test.dfx):
st=(com. esproc.jdbc.InternalCStatement)con.createStatement();
ResultSet set=st.executeQuery("=file(\"D:/employee.txt\").import@t().select(BIRTHDAY>=date(1981,1,1)&&GENDER==\"F\" || NAME+SURNAME==\"RebeccaMoore\")");

This piece of code in Java calls a line of code in esProc script directly, that is, get the data from the text file, perform conditional filteringand return the result set to set– the object of ResultSet.

It is assumed, in the above approach, that the file is small enough to be loaded to the memory all together. In reality, there may be huge files that cannot be loaded all together or the situation where it is believed that it is unnecessary to increase memory usage even if the file is not huge. In these occasions, file cursor can be used to handle the operation, thus the program in esProc can be modified in this way: 
A1Define a file cursor object, with the first row being the title and tab being the field separator by default.

A2Perform conditional filtering on the cursor, using macro to realize parsing the expression dynamically. The “where” in this process is an input parameter. In executing, esProc will first compute the expression enclosed by ${…}, then replace ${…} with the computed result acting as the macro string value and interpret and execute the code. The final code to be executed in this example is =A1.select(BIRTHDAY>=date(1981,1,1) && GENDER=="F").


A3Return the cursor.
Despite a cursor returned to Java by esProc, it is no need to modify the calling program of Java. esProc will automatically fetch the data corresponding to the cursor while Java is traversing the data with ResultSet.

If it is needed to write the filtered data to another file, instead of returning them to the main program, you just modify the expression in A3 into =file("D:/employee_group.txt").export@t(A2). esProc will write out the data of the cursor to a file. 

2014年10月19日星期日

esProc Helps Process Heterogeneous Data Sources in Java – Excel

Generally, Java programmers use poi or other open source packages to read and compute Excel data. These open source packages support low-level programming, which increases the overall learning cost and complicates the operation. But with the help of esProc, Java can avoid these problems.

An example will make the point clearly understood. Read the information of sales orders from the Excel file orders.xls and select sales orders that placed on and after January 1st, 2010 and in which SELLERID is equal to 18. The content of orders.xls is shown as follows:
ORDERID
CLIENT
SELLERID
AMOUNT
ORDERDATE
1
UJRNP
17
392
2008/11/2 15:28
2
SJCH
6
4802
2008/11/9 15:28
3
UJRNP
16
13500
2008/11/5 15:28
4
PWQ
9
26100
2008/11/8 15:28
5
PWQ
11
4410
2008/11/12 15:28
6
HANAR
18
6174
2008/11/7 15:28
7
EGU
2
17800
2008/11/6 15:28
8
VILJX
7
2156
2008/11/9 15:28
9
JAYB
14
17400
2008/11/12 15:28
10
JAXE
19
19200
2008/11/12 15:28
11
SJCH
7
13700
2008/11/10 15:28

Implementation approach: Call esProc script using Java program, read and compute the data in the Excel file, and then return the result in the form of ResultSet to Java program. Since esProc supports analyzing and evaluating expressions dynamically, it will enable Java to process data as flexibly as SQL does.


First, programmers can take the criteria "sales orders that placed on and after January 1st, 2010 and in which SELLERID is equal to 18" as the parameter where and pass it to esProc program. This is shown as follows:
where is a string, its value is ORDERDATE>=date(2010,1,1) && SELLERID==18.   
    
The code for esProc program is:

A1Define a file object and import the Excel data into it. esProc's IDE can display the imported data visually, as shown in the right part of the above figure. Importxls function can access xlsx files too and can automatically identify the version of Excel according to the file name extension.

A2Perform the filtering according to the criteria, using macro to realize parsing the expression dynamically. The “where” in this process is an input parameter. In executing, esProc will first compute the expression surrounded by ${…}, take the computed result as macro string value, replace ${…} with it and then interpret and execute the code. The final code to be executed in this example is=A1.select(ORDERDATE>=date(2010,1,1) && SELLERID==18).

A3Return the eligible result set to Java. If the result needs to be written to another Excel file, just modify the code in cell A3 to =file("D:/file/orders_result.xls").exportxls@t(A2). If the criteria are changed, you just need to modify "where"– the parameter. For example, it is required to select sales orders that placed on and after January 1st, 2010 and in which SELLERID is equal to 18, or Client is equal to PWQ. The value of “where” can be written asCLIENT=="PWQ"||ORDERDATE>=date(2010,1,1) && SELLERID==18. After the code is executed, the result set in A2 is as follows:

Call this piece of code in Java with esProc JDBC and get the result. Detailed code is as follows (save the above program in esProc as test.dfx):
          //create a connection using esProc jdbc
Class.forName("com.esproc.jdbc.InternalDriver");
con= DriverManager.getConnection("jdbc:esproc:local://");
// call the program in esProc (the stored procedure); test is the name of file dfx
com.esproc.jdbc.InternalCStatementst=(com.esproc.jdbc.InternalCStatement)con.prepareCall("call test(?)");
    // set parameters
st.setObject(1,"ORDERDATE>=date(2010,1,1) && SELLERID==18 || CLIENT==\"PWQ\"");    // execute the esProc stored procedure
ResultSet set =st.executeQuery();

If the script is simple, the code can be written directly into the program in Java that calls the esProc JDBC. It won’t be necessary to write a special script file (test.dfx):
esultSet set = st.executeQuery(
"=file(\"D:/file/orders.xls\").importxls@t().select(ORDERDATE>=date(2010,1,1) && SELLERID==18 || CLIENT==\"PWQ\")");

This piece of code in Java calls a line of code in esProc script directly, that is, get the data from the Excel file and filter it according to the criteria. 

2014年10月15日星期三

esProc Helps Process Structured Text in Java–Conditional Filtering

Following problems will arise if you perform conditional filtering on text files in Java alone: 
1. The text file is not a database,soit cannot be accessed by SQL. The code needs to be modified if filtering conditions are changed. Besides, if you want a flexibleconditional filtering as that in SQL, you have to self-program the dynamic expression parsing and evaluating, resulting in a great amount of programming work.
2. Stepwise loading is required for the big files that cannot be loaded into the memory all at once. If the performance must be taken into account, you have to deal with some complicated programming like the management of file buffer and line-splitting computing.

But if esProc is used to help with Java programming, these problems can be solved without self-programmed code. The following example will teach you how to do this in detail.

The text file employee.txt has the employee information. You are required to fetch the data and select from them the female employees who were born on and after January 1, 1981.

The text fileemployee.txtis in a format as follows:
EID   NAME       SURNAME        GENDER  STATE        BIRTHDAY        HIREDATE         DEPT         SALARY
1       Rebecca   Moore      F       California 1974-11-20       2005-03-11       R&D          7000
2       Ashley      Wilson      F       New York 1980-07-19       2008-03-16       Finance    11000
3       Rachel      Johnson   F       New Mexico     1970-12-17       2010-12-01       Sales         9000
4       Emily         Smith        F       Texas        1985-03-07       2006-08-15       HR    7000
5       Ashley      Smith        F       Texas        1975-05-13       2004-07-30       R&D          16000
6       Matthew Johnson   M     California 1984-07-07       2005-07-07       Sales         11000
7       Alexis        Smith        F       Illinois       1972-08-16       2002-08-16       Sales         9000
8       Megan     Wilson      F       California 1979-04-19       1984-04-19       Marketing        11000
9       Victoria    Davis        F       Texas        1983-12-07       2009-12-07       HR    3000
10     Ryan         Johnson   M     Pennsylvania    1976-03-12       2006-03-12       R&D          13000
11     Jacob        Moore      M     Texas        1974-12-16       2004-12-16       Sales         12000
12     Jessica     Davis        F       New York 1980-09-11       2008-09-11       Sales         7000
13     Daniel       Davis        M     Florida      1982-05-14       2010-05-14       Finance    10000
Implementation approach: call esProc script with Java, import and compute the data, then return the result in the form of ResultSet to Java. Because esProc supports dynamic expression parsing and evaluating, it enables Java to perform the conditional filtering as flexibly as SQL does.


For example, it is required to query the information of female employees who were born on and after January 1, 1981. In this case, esProc can use an input parameter "where" as the condition, which is shown below: 

"where" is a string, its values is BIRTHDAY>=date(1981,1,1) && GENDER=="F".

The code written in esProc is as follows:

A1Define a file object and import the data, with the first row being the title. tab is used as the field separator by default. esProc's IDE can display the imported data visually, as shown in the right part of the above figure.

A2Perform the conditional filtering, using macro to realize parsing the expression dynamically. The "where" in this process is an input parameter. In executing, esProc will first compute the expression enclosed by ${…}, then replace ${…} with the computed result acting as the macro string value and interpret and execute the code. The final code to be executed in this example is =A1.select(BIRTHDAY>=date(1981,1,1) && GENDER=="F").

A3Return the eligible result set to the external program.
In esProc, when the filtering condition is changed, you just need to modify "where"– the parameter. For example, it is required to query the information of female employees who were born on and after January 1, 1981, or employees whose NAME+SURNAME is equivalent to "RebeccaMoore". The value of “where” can be written as BIRTHDAY>=date(1981,1,1) && GENDER=="F" || NAME+SURNAME=="RebeccaMoore". After the code is executed, the result set in A2 is as follows:

Call this piece of code in Java with esProc JDBC and get the result. Detailed code is as follows (save the above program in esProc as test.dfx):
          //create a connection withesProc JDBC
Class.forName("com.esproc.jdbc.InternalDriver");
con= DriverManager.getConnection("jdbc:esproc:local://");
//call the program in esProc (the stored procedure); test is the name of file dfx
st =(com.esproc.jdbc.InternalCStatement)con.prepareCall("call test(?)");
//set the parameters
st.setObject(1," BIRTHDAY>=date(1981,1,1) && GENDER==\"F\" ||NAME+SURNAME==\"RebeccaMoore\"");//the parameters are the dynamic filtering conditions
//execute the esProcstored procedure
st.execute();
//get the result set, which is the eligible set of employees
ResultSet set = st.getResultSet();
If the script is simple, the code can be written directly into the program in Java that calls the esProc JDBC. It won’t benecessary to write a special script file (test.dfx):
st=(com. esproc.jdbc.InternalCStatement)con.createStatement();
ResultSet set=st.executeQuery("=file(\"D:/employee.txt\").import@t().select(BIRTHDAY>=date(1981,1,1)&&GENDER==\"F\" || NAME+SURNAME==\"RebeccaMoore\")");
This piece of code in Java calls a line of code in esProc scriptdirectly, that is, getthe data from the text file, perform conditional filteringand return the result set to set– the object ofResultSet.

It is assumed, in the above approach, that the file is small enough to be loaded to the memory all together. In reality, there may be huge files that cannot be loaded all together or the situation where it is believed that it is unnecessary to increase memory usage even if the file is not huge. In these occasions, file cursor can be used to handle the operation, thus the program in esProc can be modified in this way:

A1Define a file cursor object, with the first row being the title and tab being the field separator by default.

A2Perform conditional filtering on the cursor, using macro to realize parsing the expression dynamically. The “where” in this process is an input parameter. In executing, esProc will first compute the expression enclosed by ${…}, then replace ${…} with the computed result acting as the macro string value and interpret and execute the code. The final code to be executed in this example is =A1.select(BIRTHDAY>=date(1981,1,1) && GENDER=="F").

A3Return the cursor.

Despite a cursor returned to Java by esProc, it is no need to modify the calling program of Java. esProc will automatically fetch the data corresponding to the cursor while Java is traversing the data with ResultSet.

If it is needed to write the filtered data to another file, instead of returning them to the main program, you just modify the expression in A3 into =file("D:/employee_group.txt").export@t(A2). esProc will write out the data of the cursor to a file. 

2014年10月14日星期二

esProc Helps Process Heterogeneous Data sources in Java - Hive

It is easy for Java to connect to Hive using JDBC. But the computational ability of Hive is less than that of SQL in other databases. So to deal with uncommon computations, data should be retrieved before further operation is performed using Java. Thus the code for will be complicated.

But if esProc is used to help with the Java programming, the complicated operation as a result of using Hive in Java will become simpler. The following example will show how esProc works with Java in detail. orders is a table in Hive containing the detailed data of sales orders. Now it is required to compute the year-on-year comparison and link relative ratio. The data is as follows:

ORDERID CLIENT     SELLERID AMOUNT ORDERDATE
1       UJRNP      17     392  2008/11/2 15:28
2       SJCH         6       4802         2008/11/9 15:28
3       UJRNP      16     13500       2008/11/5 15:28
4       PWQ         9       26100       2008/11/8 15:28
5       PWQ         11     4410         2008/11/12 15:28
6       HANAR     18     6174         2008/11/7 15:28
7       EGU 2       17800       2008/11/6 15:28
8       VILJX         7       2156         2008/11/9 15:28
9       JAYB          14     17400       2008/11/12 15:28
10     JAXE          19     19200       2008/11/12 15:28
11     SJCH         7       13700       2008/11/10 15:28
12     QUICK      11     21200       2008/11/13 15:28
13     HL    12     21400       2008/11/21 15:28
14     JAYB          1       7644         2008/11/16 15:28
15     MIP  16     3234         2008/11/19 15:28


Link relative ratio refers to comparison between the current data and data of the previous period, using month as the time interval. For example, divide the sales figure in April by that in March and we get the link relative ratio of April. Year-on-year comparison is the comparison between the current data and data of the corresponding period of the previous year, which means, for example, dividing the sales figure of April 2014 by that of April 2013. Since Hive provides no window functions, it cannot complete the computation unless using nested SQL. But Hive supports very poor subquery and usually the computation should be performed outside of the database. With esProc, however, the computation can be realized easily. The code is as follows:
A1Connect to the database through JDBC using the datasource Hive defined in advance.

A2Query the data in the database by the time period using external parameters begin and end. Such as begin="2011-01-01 00:00:00", end="2014-07-08 00:00:00" (i.e. the current date which can be obtained using now() function).
A3:Group orders by the year and the month and sum up to get the sales of each month.

A4:Add a new field Irr, which is the monthly link relative ratio. The expression is mAmount/mAmount[-1], in whichmAmount represents the sales in the current time period and mAmount[-1] represents the sales in the previous one. Note that the link relative ratio of the initial month (January of 2011) is empty.

A5:Sort the data in A4 by the month and the year before we compute the year-on-year comparison. Complete code should be =A4.sort(m,y). But since A4 has been sorted by the year, here we just need to sort it by the month, that is A4.sort(m), which has a better performance.

A6:Add another new field yoy, which is the year-on-year comparison of the monthly sales figure. The expression is if(m==m[-1],mAmount/mAmount[-1],null), which means the year-on-year comparison is valid only between the same months of the two time periods. The year-on-year comparison of each month in the initial year (the year of 2011) is empty.

A7Sort the data in A6 by the year in descending order and by the month in ascending order. Note that the data is valid up to July of 2014. The result is as follows:
A8Close Hive database connection.

A9Return the result.

This block of code can be called by Java using esProc JDBC to get the final result (the above esProc program will be saved as test.dfx). The code for this is as follows:
          // create a connection usingesProcjdbc
Class.forName("com.esproc.jdbc.InternalDriver");
con= DriverManager.getConnection("jdbc:esproc:local://");
// call the esProc program (the stored procedure); test is the file name of dfx
st =(com.esproc.jdbc.InternalCStatement)con.prepareCall("call test(?,?)");
// set the parameters
st.setObject(1,"2011-01-01 00:00:00");//begin
st.setObject(1,"2014-07-08 00:00:00");//end
// execute esProc stored procedure
st.execute();
// get the result set
ResultSet set = st.getResultSet();

It is the same way in which esProc accesses Hive and other ordinary databases. Just configure their JDBC while detailed process is omitted here.