SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Execution plan


Execution plan

Author
Message
tommyh
tommyh
SSCommitted
SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)

Group: General Forum Members
Points: 1782 Visits: 2000
An article by Jeff Moden (http://www.sqlservercentral.com/articles/T-SQL/71511) got me really confused and instead of hijacking that thread i thought i would start a new one.

I had planed on asking why not using a second group by instead of the odd order by, like

SELECT [Month] = DATENAME(mm,SomeDateTime),
DatePart(mm, SomeDateTime),
Amount = SUM(SomeAmount)
FROM #MyHead
WHERE SomeDateTime >= '2010' AND SomeDateTime < '2011'
GROUP BY DATENAME(mm,SomeDateTime), DatePart(mm, SomeDateTime)
ORDER BY DatePart(mm, SomeDateTime)
;



It looked to me to be as effective or even better since i "lost" an extra "Compute Scalar". However when i removed the where clause (to force a bit bigger result) it showed that Jeffs was faster by something like 20%. That wasnt what i found strange. It was the execution plan.

Just looking at the numbers produced i would have guessed that the second one would be faster. The CPU and I/O are identical (on the matching parts). But the first query has an extra operation. So how the heck can it be faster?

Running on:
Microsoft SQL Server 2005 - 9.00.1399.06 (Intel X86) Oct 14 2005 00:33:37 Copyright (c) 1988-2005 Microsoft Corporation Developer Edition on Windows NT 5.2 (Build 3790: Service Pack 1)

PS I boosted the nr of rows in Jeffs temptable to 10 000 000.

/T
Attachments
Plan1.sqlplan (20 views, 13.00 KB)
Plan2.sqlplan (14 views, 13.00 KB)
Paul White
Paul White
SSCoach
SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)

Group: General Forum Members
Points: 15912 Visits: 11355
There are a number of differences, but the three most important are:

1. In the Compute Scalar immediately above the Clustered Index Scan, the plan with the extra GROUP BY calculates:

[Expr1004] = Scalar Operator(datename(month,[tempdb].[dbo].[#MyHead].[SomeDateTime]))
[Expr1005] = Scalar Operator(datepart(month,[tempdb].[dbo].[#MyHead].[SomeDateTime]))


whereas the plan with the single GROUP BY only calculates:

[Expr1004] = Scalar Operator(datename(month,[tempdb].[dbo].[#MyHead].[SomeDateTime]))


2. The extra DATEPART computation may not seem much, but it is repeated 10,000,000 times. The extra expression also increases the total amount of data flowing into the Hash Match (Aggregate) that comes next. Again, the per-row size of the DATEPART expression (integer) is only four bytes, but multiplied by ten million, it adds up. The total row size estimates show an increase from 467MB to 505MB, which is 38MB or 8% extra.

3. The Hash Match aggregate implements the GROUP BY expression(s). The double GROUP BY, as well as having an extra 38MB of input, performs a more complex hash match on [Expr1004],[Expr1005] versus [Expr1004] alone.

Of course it is pretty obvious to a human that DATEPART and DATENAME in this context will group in exactly the same way (they are based on the same date, and use the same MONTH granularity). Each possible value of DATEPART is associated with exactly one value of DATENAME. This one-to-one mapping means that we only need to group on DATEPART or DATENAME, not both.

The query optimizer does not include logic to reason that way, so it performs the literal double grouping. (Another consideration here is that DATENAME is non-deterministic).

I hope that helps.

Paul



Paul White
SQLPerformance.com
SQLblog.com
@SQL_Kiwi
Dave Ballantyne
Dave Ballantyne
SSCrazy
SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)

Group: General Forum Members
Points: 2934 Visits: 8370
I believe the difference here is in the hash match operation , in your version it is matching on 2 columns and in Jeff's 1 column.

Hash matching is a comparatively expensive operation and the pure number of scalar operations executed is a bit of a red herring.

-- Though Paul has done much better explanation ;-)



Clear Sky SQL
My Blog
tommyh
tommyh
SSCommitted
SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)

Group: General Forum Members
Points: 1782 Visits: 2000
Thank you both.

I did figure it would be partly because of the much larger dataset. I had just assumed that something in the execution plan regarding cost (CPU/IO/Operator) would differ more to indicate that one would be better then the other. The hashmatch differs between 66.9272 vs 66.9432 which to me feels insignificant. But i guess there is more to decifering an execution plan then meets the eye.

/T
Dave Ballantyne
Dave Ballantyne
SSCrazy
SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)

Group: General Forum Members
Points: 2934 Visits: 8370
Be careful when comparing costs within the plan , they are ALL estimated costs. The actual costs could differ dramatically.



Clear Sky SQL
My Blog
Paul White
Paul White
SSCoach
SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)

Group: General Forum Members
Points: 15912 Visits: 11355
tommyh (11/15/2010)
I did figure it would be partly because of the much larger dataset. I had just assumed that something in the execution plan regarding cost (CPU/IO/Operator) would differ more to indicate that one would be better then the other.

The extra cost is primarily in the extra 10 million DATEPART calculations in the Compute Scalar. I listed the other effects because I thought you were after a comprehensive explanation. The exact reasons for this are complex, and to do with the way Compute Scalars are implemented, and the fact that the query plan contains a scalar hash aggregate - as opposed to a hash join.

The reason you don't see much difference in the costs of the Compute Scalars in the two plans is because the cost model doesn't really do much costing with Compute Scalars. See http://www.sqlskills.com/BLOGS/CONOR/post/SQL-Server-Scalar-Operator-Costing-aka-umm-what-costing-.aspx.

The hashmatch differs between 66.9272 vs 66.9432 which to me feels insignificant. But i guess there is more to decifering an execution plan then meets the eye.

In this case, the estimated costs are not too far off - the extra hash effort is pretty small. By the way, are you able to run this same query on a machine with many cores? The difference parallelism makes would be interesting - that's my bet anyway, but I am often wrong ;-)



Paul White
SQLPerformance.com
SQLblog.com
@SQL_Kiwi
Paul White
Paul White
SSCoach
SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)

Group: General Forum Members
Points: 15912 Visits: 11355
Dave Ballantyne (11/15/2010)
I believe the difference here is in the hash match operation , in your version it is matching on 2 columns and in Jeff's 1 column. Hash matching is a comparatively expensive operation and the pure number of scalar operations executed is a bit of a red herring.

Sorry Dave I hadn't read this one when I posted my last - apologies if it sounds 'off' - just a timing thing, I promise. Though I do disagree with your assessment ;-)



Paul White
SQLPerformance.com
SQLblog.com
@SQL_Kiwi
deepak_19888
deepak_19888
Forum Newbie
Forum Newbie (8 reputation)Forum Newbie (8 reputation)Forum Newbie (8 reputation)Forum Newbie (8 reputation)Forum Newbie (8 reputation)Forum Newbie (8 reputation)Forum Newbie (8 reputation)Forum Newbie (8 reputation)

Group: General Forum Members
Points: 8 Visits: 52
Hi Paul

I am trying to learn the execution plan and aims to become a DBA.

In the above thread i did tried to understand the two execution plans and did get some success but failed to understand how to know which 'compute scalar' computes what expression.

and why the sorting operator has different sequence in both the plans.

Grateful if you could throw some light on the above issues.

Thanks
Deepak !
deepak_19888
deepak_19888
Forum Newbie
Forum Newbie (8 reputation)Forum Newbie (8 reputation)Forum Newbie (8 reputation)Forum Newbie (8 reputation)Forum Newbie (8 reputation)Forum Newbie (8 reputation)Forum Newbie (8 reputation)Forum Newbie (8 reputation)

Group: General Forum Members
Points: 8 Visits: 52
Hi Paul

I am trying to learn the execution plan and aims to become a DBA.

In the above thread i did tried to understand the two execution plans and did get some success but failed to understand how to know which 'compute scalar' computes what expression.

and why the sorting operator has different sequence in both the plans.

Grateful if you could throw some light on the above issues.

Thanks
Deepak !
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search