In the above computation, we have used the keyword top. In fact there
is not such a thing (it can be combined out by other computation computations)
in the theory of relational algebra, and this is not the standard writing style
of SQL.
Let us see how difficult it is to look for the top 10 places when
there is no top.
Rough working thought: Seek out the number of members whose sales
amount are higher than itself to rank the sales person, and then get the
members whose places do not exceed 10, and the SQL is written as follows:
select
sales
from
( select A.sales sales, A.product product,
(select
count(*)+1 from sales_amount
where
A.product=product AND A.amount<=amount) ranking
from sales_amount A )
where
product='AC' AND ranking<=10
or
select
sales
from
( select A.sales sales, A.product product, count(*)+1 ranking
from sales_amount A,
sales_amount B
where
A.sales=B.sales and A.product=B.product AND A.amount<=B.amount
group
by A.sales,A.product )
where product='AC' AND
ranking<=10
Professional technical personnel may not necessarily write such SQL
statement well! And only the first ten places are computed.
To say the least, even if there is top, it only makes it easy
to get the preceding part lightly. If we change the problem into getting the 6th
place to the 10th place, or seeking the salesman whose sales amount
is 10% more than that of the next one, the difficulty is still there.
The reason causing this phenomenon lies in the third important disadvantage
of SQL: Lack the support of ordered set.
SQL inherits the unordered set in mathematics, which directly causes the fact
that the computations relating to sequence are rather difficult. And it can be
imagined how common the computations relating to sequence (such as over the
preceding month, over the same period last year, the first 20%, and rankings)
will be.
The newly added window functions in SQL2003 standard provides some
computation capabilities relating to sequence, which makes it possible to solve
some problems in a relatively simple method and alleviate the problem of SQL to
a certain extent. But the use of window functions is often accompanied by
sub-query, and it cannot enable user to directly use the sequence number to
access set member, so there are still many ordered computations that are
difficult to solve.
Now we want to pay attention to the gender proportion of the “good”
salespersons that are computed out, that is, how many males and females there
are respectively. Generally, the gender information is recorded in the employee
table but not in the performance table, and it is simplified as follows:
employee
|
Employees table
|
name
|
Names of employees, suppose there is no
repeated name.
|
gender
|
Genders of employees.
|
We have already computed out the name list of “good” salespersons,
and the relatively natural idea is to seek out their genders from the employee
table using name list, and count the number. But in SQL, it is necessary to use
join operation to get information across tables . In this way, following the
initial result, SQL will be written as:
select
employee.gender,count(*)
from
employee,
(
( select top 10 sales from sales_amount where product='AC' order by amount desc
)
intersect
(
select top 10 sales from sales_amount where product='TV' order by amount desc )
) A
where
A.sales=employee.name
group
by employee.gender
With only an associated table more, it is made so over-elaborate and
in reality there are rather more cross-table storages and they are often
multi-layered. For example, for salespersons, there are departments where there
are managers, and now we want to know by which managers these “good”
salespersons are managed. Then there are three table joins, and it is indeed no
easy job to write clear where and group in this
computation.
This is just the fourth important disadvantage of SQL as we want to
say: Lack of object reference, in
relational algebra, the relations between objects completely depends on foreign
key. This not only makes the efficiency very low in looking for relation, but
also makes it impossible to directly treat the record pointed by foreign key as
the attribute of primary record . Try thinking, can the above statement be
written as this:
select
sales.gender,count(*)
from (…) // …is the SQL
computing the “good” salespersons above
group by sales.gender
Evidently, this statement is not only clearer, and at the same time,
the computation will also be more efficient (without join computation).
We have analyzed, through a simple example, the four important
difficulties of SQL. We believe this is just the main reason why SQL fails to
reach the original intention of its invention. The process of solving business
problem based on a kind of computation system is in fact the process of translating business problems into
formalized computation syntax (similar to the case in which a pupil solves
application problem, translates the problem into formalized four arithmetic
operations). Before overcoming these difficulties, SQL model system rather does
not comply with people’s natural thinking habit, causing great barriers in
translating problems, making it very difficult for SQL to be applied, on a
large scale, in data computation for business problems.
For still another example which is easily understood by programmer,
use SQL as data computation, which is similar to the case in which assembly
language is used to complete four arithmetic operations. We very easily write
out the calculation expression such as 3+5*7, but to use assembly
language (take X86 as the example), it needs to be written as
mov
ax,3
mov bx,5
mul
bx,7
add
ax,bx
In either writing or
reading, such code is far inferior to 3+5*7 (it will be more
troublesome if we come across decimal). Though it cannot be regarded as a big
problem to a skilled programmer, to most people, however, this kind of writing
is too hard to understand. In this sense, FORTRAN is really a
great invention.
没有评论:
发表评论