July 20, 2010 at 12:57 am
Hi friends,
In the book Inside Microsoft SQL Server™ 2005 T-SQL Querying, the author Itzik Ben-Gan tell us the order or SQL Server 2005's logical query processing is:
1. FROM
2. ON
3. OUTER (join)
4. WHERE
5. GROUP BY
6. CUBE | ROLLUP
7. HAVING
8. SELECT
9. DISTINCT
10. ORDER BY <---------------------- NOTE
11. TOP <---------------------- NOTE
In his book Inside Microsoft SQL Server 2008 : T-SQL Querying, he tell us the following logical query processing order:
(1) FROM
(1-J1) Cartesian Product
(1-J2) ON Filter
(1-J3) Add Outer Rows
(2) WHERE
(3) GROUP BY
(4) HAVING
(5) SELECT
(5-1) Evaluate Expressions
(5-2) DISTINCT
(5-3) TOP <---------------------- NOTE
(6) ORDER BY <---------------------- NOTE
Note the order of TOP and ORDER BY in the upper excerpts from these books. They are just opposite. I think these two step is very important, and will give a totally different result with different order. I'm confused about this so I wrote a letter to the author, the following is the response from the author. After reading it I'm still unclear. Anyone can help me with it? Great thanks.
Response from the author:
Hi,
That’s a good observation. Indeed I changed my view of how TOP fits in the logical query processing model between the editions of the books. But it’s important to note what I also mentioned in the books—that logical query processing is a conceptual model, and the database engine does not need to follow it as long as it can guarantee the same result as defined by logical query processing. With this in mind, there’s no real relevance to the storage engine here.
At any rate, in the 2005 edition of the book I had in mind conceptually an ordering step followed by a TOP filtering step. In 2008, I changed the way I view things. If you think about it, conceptually there are two separate phases involved that just happen to rely on the same ORDER BY clause. First, there’s the TOP filtering phase that as part of its specification has a number (or percent) of rows to be filtered, and logical ordering that defines which rows to filter. Then there’s the presentation ordering phase that arranges the rows in the output in the requested order (that’s if the query is not used in a table expression like a view).
If you think about it, in both cases you get the same results; it’s just two different conceptual ways to look at things. And since logical query processing is a conceptual model that just describes conceptually how the correct output is achieved, both are valid. Anyway the engine can do what it likes to get to the correct result—that’s what optimization is all about.
Hope this explains things.
Regards,
Itzik
Viewing post 1 (of 1 total)
You must be logged in to reply to this topic. Login to reply