Description of the Issue
Some reports in a project suffered from
very low speed. Despite various iReport and Oracle database optimizations,
the situation is not yet satisfying. For example, there is a detail report, involving
large data volume, many (dozens of) data tables, and frequent inter-table join
(including self join). This report includes inter-cell computing expressions
(ratios and sum).
Here are some complicated data set SQL statements from this report:
(select *
from (select syb.org_abbn as syb,
max(xmb.org_abbn) as xmb,
sub.org_subjection_id as sub_id,
oi.org_abbn as org_abb,
rm.rec_notice_org_id,
rm.synergic_team as xz_team,
xzdw.coding_name as xz_org,
l.requisition_cd as req_cd,
l.requisition_id as req_id,
l.note as req_note,
nvl(decode(l.ops_content6,
2000200012,
'Yes',
2000200011,
'No'),
'') as sflj,
--too
long, most part from the select clause is omitted.
fromlcr l
left join lcrrm on rm.requisition_id =
l.master_bill_id
andrm.table_type = '0'
andnvl(rm.bsflag, 0) != 1
left join cos sub on l.org_id = sub.org_id
andnvl(sub.bsflag, 0) != 1
left join coioi on oi.org_id = sub.org_id
andnvl(oi.bsflag, 0) != 1
--too
long, most part from the join is omitted.
wherel.table_type = '1'
andl.requisition_state = '0101020304'
andnvl(l.bsflag, 0) != 1
andto_char(l.back_date,
'yyyy-MM-dd') between '2012-01-01' and
'2012-04-25'
group by l.requisition_id,
l.note,
l.requisition_type,
sub.org_subjection_id,
syb.org_abbreviation,
rm.rec_notice_org_id,
oi.org_abbreviation,
--too
long, most of the group by fields are omitted
) a-- main query a
LEFT JOIN crviewve-- viewve
ON ve.requisition_id = a.req_id
If you check these SQL statements
carefully, you’ll find immediately that there are too many tables associated,
including a lot of self-join. Meanwhile, there are many sub query embedded in
it. To make this worse, it is also associated with a view, which is very
complicated.
Currently the data presentation time for
this report, when querying against 4 months data volume, is 6 minutes 42 seconds. This is far from what the end-user
could accept.
As mentioned before, the report has been
optimized several times. The data set SQL and report expressions have gone through
careful tuning process. The above data set SQL is very complicated, with no
room for further optimization. Meanwhile, as real time query, the use of
pre-computed intermediate table for acceleration is also not a feasible
approach.
After analyzing the report we find that it involves
two stages: 1) the data loading stage (data set SQL execution stage), and 2) report computation and presentation stage. The first stage requires 5 minutes, and the
second stage requires more than 1 minute. The reason for the slowing running of
data set SQL is caused by the extremely low efficiency of the join in two sub
queries (main query a and view ve).
Thus we find a new approach for
optimization: we’ll mainly optimize the data set loading by improving the efficiency
of SQL join. At the same time, we’ll optimize the computation and presentation
part.
Resolution Process
The esProc approach for resolution of this
issue is as following:
1. Split the data set SQL of the report
As previously mentioned, the join between the
two sub queries is causing the slow running of the SQL. We use esProc to
execute the SQL for two sub queries, and then complete the association in
esProc with “switch” (“switch” or “join” is used accordingly) statement. After
test run we find significant improvement on efficiency.
2. Eliminate inter-cell computing from the
report
The inter-cell computing (ratios and sum)
part in the original report template is moved into esProc, thus the report
generation could be speed up due to the removal of grid scanning.
3. Return the result set to the report all
together
After all data preparation is done through
esProc, the result will be returned to reporting tool all together. Once data
source is received, the presentation will be done directly, without any computation
(such as inter-cell computing) that might affect efficiency.
The complete codes for esProc are as
following:
Solution Result
Through the above process, total report
presentation time is radically reduced from the original 6 minutes 42 seconds
to 57seconds - less than 1 minute. The benefit
of this optimization is remarkable. This is what the end-user is happy to see.
Conclusion
In the process of the problem resolution,
we found that the main query a and
view ve in the original SQL statement
requires only 10 to 40 seconds when executed in Oracle separately. However, a
join between a and view ve requires several minutes. This is
because Oracle cannot always find a reasonable approach when automatic
execution plan is used. If human interference is required, it will be very tedious
and time consuming.
esProc
could improve the performance, because we know that ve is actually a dimensional
table of a. Thus we can use a particular method of “switch”. This allows human
definition of the execution plan for complicated query. In combination with Oracle’s
basic query statement, it will speed up the process significantly.
About esProc: http://www.raqsoft.com/product-esproc
没有评论:
发表评论