Query Plan Execution Order

Question

Post reply

Query Plan Execution Order

Viewing 15 posts - 46 through 60 (of 103 total)

You must be logged in to reply to this topic. Login to reply

CoolCodeShare SSCrazy Points: 2585 More actions · Answer 1

CoolCodeShare

SSCrazy

Points: 2585

November 2, 2011 at 4:15 am

#1402045

Another V Good QoTD from Kiwi!!!

Rob Ashton Right there with Babe Points: 785 More actions · Answer 2

Have to agree with others, this was a great QotD. I'm happy to say I got it wrong, but I would argue that I was one of those who got it wrong from misunderstanding the term used, "executes".

My understanding of how data is pulled by the root operator from its child operator(s) was correct, I simply thought of operator initialisation and execution as two distinct phases.

Good stuff, hope there's more to come!

Sergiy SSC Guru Points: 110209 More actions · Answer 3

Simple question:

what's being aggregated if the data in the table have not been scanned yet?

_____________
Code for TallyGenerator

Hugo Kornelis SSC Guru Points: 64790 More actions · Answer 4

Sergiy (1/12/2012)
Simple question:
what's being aggregated if the data in the table have not been scanned yet?

Most operators don't process all data at once; they are called repeatedly with the request to return a row. Leaving out the "initialize yourself" calls and focusing on the "give me a row" calls, here is the sequence of operators calling each other for this specific plan.

1. Stream Aggregate is called with a "give me a row" request.

2. Stream Aggregate doesn't have any data yet, so it calls the Nested Loops operator to get a row.

3. Nested Loops has no data yet on either input, so it calls the Table Scan operator to get a row.

4. Table Scan reads first row and returns is to Nested Loops.

5. Nested Loops now calls Index Seek to get a row for its second input.

6. Index Seek reads first index row and returns it to Nested Loops.

7. Nested Loops combines inputs and returns row to Stream Aggregate.

8. Stream Aggregate increases internal results (row counter), then calls Nested Loops again for the next row.

9. Steps 5-8 are repeated until the Index Seek returns a "no more matching rows" condition.

10. Steps 3-9 are repeated until the Table Scan returns a "no more matching rows" condition.

11. At this point, Nested Loops returns "no more matching rows" to its caller.

12. Stream Aggregate receives "no more matching rows" and returns the internal results (row counter) to the caller.

If there had been a GROUP BY, this all would have been a bit more complex - in that case, Stream Aggregate would return a row and reset its internal counter after each change in the GROUP BY values.

So the first operator to execute is the Stream Aggregate (step 1), but the first operator to actually touch any real data is the Table Scan (step 4).

Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
Visit my SQL Server blog: https://sqlserverfast.com/blog/
SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

Sergiy SSC Guru Points: 110209 More actions · Answer 5

Hugo Kornelis (1/12/2012)
Sergiy (1/12/2012)
Simple question:
what's being aggregated if the data in the table have not been scanned yet?
Most operators don't process all data at once; they are called repeatedly with the request to return a row. Leaving out the "initialize yourself" calls and focusing on the "give me a row" calls, here is the sequence of operators calling each other for this specific plan.
1. Stream Aggregate is called with a "give me a row" request.
2. Stream Aggregate doesn't have any data yet, so it calls the Nested Loops operator to get a row.
3. Nested Loops has no data yet on either input, so it calls the Table Scan operator to get a row.
4. Table Scan reads first row and returns is to Nested Loops.
5. Nested Loops now calls Index Seek to get a row for its second input.
6. Index Seek reads first index row and returns it to Nested Loops.
7. Nested Loops combines inputs and returns row to Stream Aggregate.
8. Stream Aggregate increases internal results (row counter), then calls Nested Loops again for the next row.
9. Steps 5-8 are repeated until the Index Seek returns a "no more matching rows" condition.
10. Steps 3-9 are repeated until the Table Scan returns a "no more matching rows" condition.
11. At this point, Nested Loops returns "no more matching rows" to its caller.
12. Stream Aggregate receives "no more matching rows" and returns the internal results (row counter) to the caller.
If there had been a GROUP BY, this all would have been a bit more complex - in that case, Stream Aggregate would return a row and reset its internal counter after each change in the GROUP BY values.
So the first operator to execute is the Stream Aggregate (step 1), but the first operator to actually touch any real data is the Table Scan (step 4).

From your explanation it's obvious that Stream Aggregate is actually first time executed only on step 8, AFTER table scan, index seek and nexted loop are completed.

Which makes it the last in the queue, not the first.

_____________
Code for TallyGenerator

Sergiy SSC Guru Points: 110209 More actions · Answer 6

Hugo Kornelis (1/12/2012)
So the first operator to execute is the Stream Aggregate (step 1), but the first operator to actually touch any real data is the Table Scan (step 4).

As you said - on step 1 the Stream Aggregate does not have any data to work with (see item 2. of your explanation).

Therefore it physically cannot be executed . It just does not have anything to work with.

Most people miss the very important thing about running SQL statement.

There are several stages of such run:

- parsing;

- compiling;

- execution.

What you described in item 1 actually happens on compilation stage.

The Stream aggregate is compiled first, placed into memory and prepared to accept data from other modules of the program (a query is a small program, right?).

It's the optimiser and compiler wjich are executed on this stage, not Stream Aggregate.

But if we are talking about execution, then data collection comes first.

Data must be collected and filtered by FROM, INNER JOIN and WHERE operators to let any other one to work with the collection.

Isn't it obvious?

_____________
Code for TallyGenerator

Hugo Kornelis SSC Guru Points: 64790 More actions · Answer 7

I think the problem is lingual.

"Is executed" can have two meanings.

1. Present tense, passive form - the code is invoked. Can be compared to running "EXEC MyStoredProc" in SQL Server. As soon as you run that command, execution passes to the first statement of your stored procedure - your stored procedure "is executed".

2. Present tense with perfect aspect - execution has finished. When the execution of your stored procedure has finished and control is returned to the caller, the stored procedure "is executed" (though I think most English speakers would use "has executed" here).

Paul's intention, and my interpretation, is the first meaning: execution starts. And with that meaning, Stream Aggregate is the correct answer, This is the first operator that gets control, initializes itself, then passes control to (by calling) the next operator.

Another way to explain this is to remember that operators do not "push" rows up to their parent operator; operators "pull" rows from their descendants. (Or, as Rob Farley worded it, "execution plans suck").

Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
Visit my SQL Server blog: https://sqlserverfast.com/blog/
SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

Sergiy SSC Guru Points: 110209 More actions · Answer 8

Hugo Kornelis (1/12/2012)
I think the problem is lingual.
"Is executed" can have two meanings.
1. Present tense, passive form - the code is invoked. Can be compared to running "EXEC MyStoredProc" in SQL Server. As soon as you run that command, execution passes to the first statement of your stored procedure - your stored procedure "is executed".
2. Present tense with perfect aspect - execution has finished. When the execution of your stored procedure has finished and control is returned to the caller, the stored procedure "is executed" (though I think most English speakers would use "has executed" here).
Paul's intention, and my interpretation, is the first meaning: execution starts. And with that meaning, Stream Aggregate is the correct answer, This is the first operator that gets control, initializes itself, then passes control to (by calling) the next operator.
Another way to explain this is to remember that operators do not "push" rows up to their parent operator; operators "pull" rows from their descendants. (Or, as Rob Farley worded it, "execution plans suck").

I don't think it's lingual.

The problem is in misunderstanding how SQL works.

As soon as you run that command, execution passes to the first statement of your stored procedure - your stored procedure "is executed".

Wrong!

As soon as you run this command the procedure (query) is parsed.

After parsing is completed optimiser looks for an execution plan (compiled code) for the SQL batch.

If it cannot be found then goes compilation.

Compilation figures out the future execution plan, and it makes it from end to the beginning.

And only if compiler can find all the objects, allocate memory for all variables, etc., and actually compile the code then comes the execution.

On compilation stage it's right, Stream Aggregation has to be compiled first to make an "output pipe" for other operations which are gonna to supply the data for the aggregation.

But on run time (it's when the statement is executed) Stream Aggregation comes last, as all others must be executed first to supply data for the aggregation.

_____________
Code for TallyGenerator

Hugo Kornelis SSC Guru Points: 64790 More actions · Answer 9

Sergiy (1/12/2012)
Wrong!
As soon as you run this command the procedure (query) is parsed.
After parsing is completed optimiser looks for an execution plan (compiled code) for the SQL batch.
If it cannot be found then goes compilation.
Compilation figures out the future execution plan, and it makes it from end to the beginning.
And only if compiler can find all the objects, allocate memory for all variables, etc., and actually compile the code then comes the execution.
On compilation stage it's right, Stream Aggregation has to be compiled first to make an "output pipe" for other operations which are gonna to supply the data for the aggregation.
But on run time (it's when the statement is executed) Stream Aggregation comes last, as all others must be executed first to supply data for the aggregation.

You are right about all the preparation steps. You are wrong about how the compiled plan is executed.

Every operator is a routine (or object) that can be invoked to execute one of the following methods: Init() (to initialize), GetNext() (to get the next row from the result of the operator), and Close() (to clean up and shut down).

Once all the preparation is done and SQL Server is ready to actually execute the query, it will simply call the first (top-left) operator of the plan. The first call will be to the Init() method, then the GetNext() method is called until no more rows are returned, and then Close() is called and execution of the query is complete.

All other operators are invoked directly or indirectly from the top-left operator.

See http://msdn.microsoft.com/en-us/library/ms191158.aspx for a description of how operators work.

Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
Visit my SQL Server blog: https://sqlserverfast.com/blog/
SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

Sergiy SSC Guru Points: 110209 More actions · Answer 10

And here we come to the order of SELECT statements execution.

1. FROM

2. INNER JOIN

3. WHERE

...

SELECT comes somewhere 7th or 8th in line.

After that comes GROUP BY.

And that's where the aggregation is sitting.

You cannot argue, that outcome of aggregation depends on the definition of GROUP BY, can you?

Table scan happens in FROM (because there is no WHERE).

Which is the first one in line.

_____________
Code for TallyGenerator

Sergiy SSC Guru Points: 110209 More actions · Answer 11

Let's modify the query from the QoD a litle bit:

DECLARE @A TABLE (a varchar(20) NOT NULL)

DECLARE @B TABLE (b varchar(20) NOT NULL UNIQUE)

INSERT @A VALUES (1)

INSERT @A VALUES ('a')

INSERT @B VALUES (1)

SELECT AVG(CONVERT(int, a))

FROM @A A

--JOIN @B B ON B.b = A.a

If you run it like this you get an error:

Syntax error converting the varchar value 'a' to a column of data type int.

But if you uncomment the JOIN line it successfully completes the operation.

Why?

The aggregate operation works on a different data set.

Which is provided by previously executed FROM and JOIN.

_____________
Code for TallyGenerator

Paul White SSC Guru Points: 150468 More actions · Answer 12

Sergiy,

I can assure you it works very much like Hugo has described; I'm afraid most of what you have asserted is incorrect. The explanation I gave for the answer included this stack trace:

sqlservr!CQScanTableScanNew::GetRow

sqlservr!CQScanNLJoinTrivialNew::GetRow

sqlservr!CQScanCountStarNew::GetRowHelper

sqlservr!CQScanStreamAggregateNew::GetCalculatedRow

sqlservr!CQScanStreamAggregateNew::Open <-- ****

sqlservr!CXStmtQuery::SetupQueryScanAndExpression

sqlservr!CXStmtQuery::InitForExecute

Notice the Stream Aggregate is the deepest operator call on the stack, so it had control before the join or table scan. I have stepped through the modules many, many times using the debugger, so I really can assure you the aggregate code starts to run first. Other references you may have missed, but were provided in the explanation:

http://blogs.msdn.com/b/craigfr/archive/2006/06/08/622315.aspx

http://bradsruminations.blogspot.com/2010/11/second-in-life-of-query-operator.html http://sqlblog.com/blogs/paul_white/archive/2010/08/05/iterators-query-plans-and-why-they-run-backwards.aspx

You can find all sorts of SQL Server internals information on my blog (as in the last link there).

Paul White
All articles available on SQL.kiwi
@SQL_Kiwi

Paul White SSC Guru Points: 150468 More actions · Answer 13

Sergiy (1/12/2012)
Let's modify the query from the QoD a litle bit...
If you run it like this you get an error:
Syntax error converting the varchar value 'a' to a column of data type int.
But if you uncomment the JOIN line it successfully completes the operation.
Why? The aggregate operation works on a different data set. Which is provided by previously executed FROM and JOIN.

The key to understanding this is that Compute Scalar operators define expressions (in this case, the interesting one defines [Expr1004] = Scalar Operator(CONVERT(int,@A.[a] as [A].[a],0))). As a performance optimization, the execution engine (more specifically the Expression Services component) generally defers evaluation of the expression until a higher operator in the tree (one that is lower on the call stack) actually needs the value to do something with it.

With the JOIN commented out, [Expr1004] is defined and then immediately required by the Stream Aggregate to compute these expressions: [Expr1005] = Scalar Operator(COUNT_BIG([Expr1004])), [Expr1006] = Scalar Operator(SUM([Expr1004])). Notice that both [Expr1005] and [Expr1006] reference [Expr1004], and causing [Expr1004] to be evaluated, generating the error message.

With the JOIN in place, the row is eliminated by the join (no matching row in table B) before the expression is evaluated by the aggregate, so no error occurs. To get the error with the JOIN in place, simply add a row to table B:

DECLARE @A TABLE (a varchar(20) NOT NULL)

DECLARE @B TABLE (b varchar(20) NOT NULL UNIQUE)

INSERT @A VALUES (1)

INSERT @A VALUES ('a')

INSERT @B VALUES (1)

INSERT @B VALUES ('a') -- *** added this row so the join passes the value up the tree

SELECT AVG(CONVERT(int, a))

FROM @A A

JOIN @B B ON B.b = A.a

Paul White
All articles available on SQL.kiwi
@SQL_Kiwi

Sergiy SSC Guru Points: 110209 More actions · Answer 14

SQL Kiwi (1/12/2012)
With the JOIN in place, the row is eliminated by the join (no matching row in table B) before the expression is evaluated by the aggregate, so no error occurs.

Thank you for confirming my point!

That's exactly what I'm trying to prove:

JOIN is executed (eliminating the extra row) BEFORE aggregation even starts to evaluate any upcoming values.

Are you ready to correct the answer on your QoD?

😉

_____________
Code for TallyGenerator

Paul White SSC Guru Points: 150468 More actions · Answer 15

Sergiy (1/12/2012)
That's exactly what I'm trying to prove:
JOIN is executed (eliminating the extra row) BEFORE aggregation even starts to evaluate any upcoming values.

You are confusing the point in time at which an iterator first starts executing, with later times when rows flow, generally one at a time, up the plan. The aggregate starts executing first, as I have already proved. You really need to read those articles I linked to, to clarify your thinking here.

Paul White
All articles available on SQL.kiwi
@SQL_Kiwi