• Jeff Moden (5/6/2014)


    KoldCoffee (5/6/2014)


    OK. I'm interested.

    In that case, here we go… let’s test the “Original” code, Scott’s code, and my humble submittal. With any luck at all, someone will come along with a method that beats all three, but I’ll be able to get a point across first. 😉

    Just as a side note, we’re not testing any scalar functions, mTVFs (multi-statement table-valued functions), rCTEs, or WHILE loops, so we can get away with using SET STATISTICS to measure performance in these tests. For more information on how badly the use of SET STATISTICS can skew performance measurements for those constructs, please see the following article. It’s worth the read for more than one reason, even if I do say so myself.

    http://www.sqlservercentral.com/articles/T-SQL/91724/
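
    As a quick aside for anyone who does need to time scalar UDFs or similar row-by-row code: a plain wall-clock measurement that doesn’t rely on SET STATISTICS avoids the skew. Here’s a minimal sketch of that style of timing (the variable names are just placeholders, not anything from the article):

    DECLARE @StartDT DATETIME2(3) = SYSDATETIME();
    --===== Run the code being timed here, ideally capturing the results in a throw-away variable
    --      so that display/rendering time doesn't pollute the measurement.
    SELECT [Elapsed ms] = DATEDIFF(ms, @StartDT, SYSDATETIME());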

    [font="Arial Black"]Original Code[/font]

    Pick your favorite test database and run the following code, making sure that the Actual Execution plan is turned off (because it adds to the duration and has some overhead of its own). The GO 3 at the end runs the original code 3 times just to make sure we don’t hit a speed bump during testing.

    PRINT '================================================================================';
    PRINT '===== Original Test';
    SET STATISTICS TIME, IO ON;
    ;WITH e1(n) AS(      --10 rows
        SELECT * FROM (VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1))x(n)
    ),
    e2(n) AS(            --10 * 10 = 100 rows
        SELECT e1.n FROM e1, e1 x
    ),
    e4(n) AS(            --100 * 100 = 10,000 rows
        SELECT e2.n FROM e2, e2 x
    ),
    e8(n) AS(            --10,000 * 10,000 = 100,000,000 rows max
        SELECT e4.n FROM e4, e4 x
    ),
    cteTally(n) AS(      --hardcoded limit of 6,307,204 rows, numbered from 0 up
        SELECT TOP 6307204 ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) - 1
        FROM e8
    ),
    Test(n) AS(
        SELECT DATEADD(minute, n, DATEADD(YEAR, -2, GETDATE()))
        FROM cteTally
    )
    SELECT DATEADD(mi, DATEDIFF(mi,0,n), 0) AS cte_start_date
    INTO dbo.OriginalTest
    FROM Test
    WHERE n <= DATEADD(YEAR, 4, GETDATE());
    SET STATISTICS TIME, IO OFF;
    DROP TABLE dbo.OriginalTest;
    GO 3

    I checked the results in the table that it created (which is dropped in the code above) and we’re OK there.
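
    If you want to verify for yourself, a quick sanity check along these lines will do it (just a sketch; run it before the DROP TABLE, or comment the drop out first):

    SELECT RowCnt = COUNT(*)            --expect 3,155,041 rows
          ,MinDT  = MIN(cte_start_date) --expect a whole minute two years before run time
          ,MaxDT  = MAX(cte_start_date) --expect a whole minute four years after run time
      FROM dbo.OriginalTest;

    Here are the displayable run results.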

    Beginning execution loop
    ================================================================================
    ===== Original Test
     SQL Server Execution Times:
       CPU time = 2668 ms,  elapsed time = 2669 ms.
    (3155041 row(s) affected)
    ================================================================================
    ===== Original Test
     SQL Server Execution Times:
       CPU time = 2652 ms,  elapsed time = 2726 ms.
    (3155041 row(s) affected)
    ================================================================================
    ===== Original Test
     SQL Server Execution Times:
       CPU time = 2637 ms,  elapsed time = 2694 ms.
    (3155041 row(s) affected)
    Batch execution completed 3 times.

    Heh… you’ve just got to love these modern 64-bit laptops with multiple high-performance processors and more memory than all my previous computers combined. My old 32-bit box with a gig of memory took more than 10 seconds to run the same thing. But, as they say, I digress. Back to the problem.

    Turn on the Actual Execution plan and run the code again. Look at the arrows closely… especially the one just to the left of the left-most nested loop formed by the cascading CTEs. If you click on that arrow, you’ll see an actual rowcount of 6,307,204, which is about twice as many rows as we need, and it’s all because of the hardcoded TOP value in the cteTally definition.
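
    For reference, the only line that has to change is the TOP in cteTally; everything else stays the same. Sketched with the corrected value, it looks like this:

    cteTally(n) AS(
        SELECT TOP 3155041 ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) - 1 --was TOP 6307204
        FROM e8
    ),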

    I won’t take the time to rewrite the whole test here but, with that large number changed to 3,155,041, which is precisely the number of rows we need, here are the run results.

    Beginning execution loop
    ================================================================================
    ===== Original Test
     SQL Server Execution Times:
       CPU time = 2043 ms,  elapsed time = 2093 ms.
    (3155041 row(s) affected)
    ================================================================================
    ===== Original Test
     SQL Server Execution Times:
       CPU time = 1966 ms,  elapsed time = 2005 ms.
    (3155041 row(s) affected)
    ================================================================================
    ===== Original Test
     SQL Server Execution Times:
       CPU time = 1965 ms,  elapsed time = 1999 ms.
    (3155041 row(s) affected)
    Batch execution completed 3 times.

    Not bad. We got about a 34% increase in performance from that. We can do better, though.
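
    (For the curious, the 34% comes straight from the CPU numbers above, run for run: (2668 - 2043) / 2043 ≈ 31%, (2652 - 1966) / 1966 ≈ 35%, and (2637 - 1965) / 1965 ≈ 34%. In other words, the original version burns roughly a third more CPU to produce exactly the same result set.)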

    [font="Arial Black"]Scott’s Code[/font]

    Let’s take a look at Scott’s code next. And, no... I’m absolutely not picking on Scott. He’s using code like I’ve seen dozens of times before from other people, and I just want to take the opportunity to tell people to be extremely leery of such code.

    Again, pick your favorite test database and run the following code, making sure that the Actual Execution plan is turned off.

    PRINT '================================================================================';
    PRINT '===== Scott Test';
    SET STATISTICS TIME, IO ON;
    ;WITH
    cteDigits AS (
        SELECT 0 AS digit UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL
        SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
    ),
    cteTally AS (
        SELECT
            [1000000s].digit * 1000000 + [100000s].digit * 100000 + [10000s].digit * 10000 +
            [1000s].digit * 1000 + [100s].digit * 100 + [10s].digit * 10 + [1s].digit AS minutes_increment
        FROM cteDigits [1s]
        CROSS JOIN cteDigits [10s]
        CROSS JOIN cteDigits [100s]
        CROSS JOIN cteDigits [1000s]
        CROSS JOIN cteDigits [10000s]
        CROSS JOIN cteDigits [100000s]
        CROSS JOIN cteDigits [1000000s]
        WHERE
            [1000000s].digit * 1000000 + [100000s].digit * 100000 + [10000s].digit * 10000 +
            [1000s].digit * 1000 + [100s].digit * 100 + [10s].digit * 10 + [1s].digit
            --limit rows generated to 6 yrs' worth of minutes
            <= DATEDIFF(MINUTE, '19000101', '19060101')
    ),
    cteTime_by_minute AS (
        SELECT DATEADD(MINUTE, minutes_increment, DATEADD(YEAR, -2, DATEADD(MINUTE, DATEDIFF(MINUTE, 0, GETDATE()), 0)))
               AS time_by_minute
        FROM cteTally t
    )
    SELECT time_by_minute
    INTO dbo.ScottTest
    FROM cteTime_by_minute;
    SET STATISTICS TIME, IO OFF;
    DROP TABLE dbo.ScottTest;
    GO 3

    Here are the displayable results…

    Beginning execution loop
    ================================================================================
    ===== Scott Test
    Table 'Worktable'. Scan count 2, logical reads 82316, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
     SQL Server Execution Times:
       CPU time = 11529 ms,  elapsed time = 4834 ms.
    (3155041 row(s) affected)
    ================================================================================
    ===== Scott Test
    Table 'Worktable'. Scan count 2, logical reads 82316, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
     SQL Server Execution Times:
       CPU time = 11762 ms,  elapsed time = 4939 ms.
    (3155041 row(s) affected)
    ================================================================================
    ===== Scott Test
    Table 'Worktable'. Scan count 2, logical reads 82316, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
     SQL Server Execution Times:
       CPU time = 11731 ms,  elapsed time = 4946 ms.
    (3155041 row(s) affected)
    Batch execution completed 3 times.

    The first question to answer is: how can CPU time be longer than elapsed time? The answer is “parallelism”. If you turn on the Actual Execution plan and run the code again, you’ll see multiple areas where parallelism came into play.
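
    If you want to prove that to yourself, force a serial plan for the statement and CPU time can no longer exceed elapsed time (although the overall duration will likely go up, since one core then does all the work). That’s just a matter of adding a query hint to the tail end of Scott’s query; here’s a sketch of only the final SELECT (nothing else changes):

    SELECT time_by_minute
      INTO dbo.ScottTest
      FROM cteTime_by_minute
    OPTION (MAXDOP 1); --force a serial plan for this one statement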

    The code also took almost twice as long to execute and used almost 4 times the CPU to do the same job. Why did it take so long and use so much CPU? If you look at the left-most nested loop in the execution plan, you’ll see a rather large arrow coming up to it from below, and that arrow carries 10,000,000 rows. Even though those rows aren’t exposed externally, they still take time to build, even on today’s fast machines.

    Another problem: where did the 82,316 reads come from? Those are PAGES of I/O, and at 8,192 bytes per page that works out to 674,332,672 bytes of I/O. Someone like Grant Fritchey, Paul White, or Gail Shaw might be able to explain exactly why it formed, but the reads come from the “Lazy Table Spool” at the bottom of that same large arrow.

    Why did all of this happen? Because the code has no way of limiting the largest number it counts to. It first generates all 10 million numbers and then filters out all but the 3 million or so that we actually need.

    What’s the fix for this code? Rewrite it so that it never generates all 10 million rows in the first place, and I don’t mean simply shrinking the largest number it can count to while still filtering out what we don’t want.
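
    Just to illustrate the principle (this is only a sketch of the idea, not the code I actually use, which follows below), the same digit CTE can be capped with a TOP so the optimizer can stop the CROSS JOINs as soon as enough rows have been produced, instead of generating all 10 million and throwing most of them away:

    DECLARE @Minutes INT = DATEDIFF(MINUTE, '19000101', '19060101'); --same 6-year span as before
    WITH cteDigits AS (
        SELECT digit FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) d (digit)
    )
    SELECT TOP (@Minutes + 1) --+1 so the sequence includes 0
           minutes_increment = ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1
      FROM cteDigits a CROSS JOIN cteDigits b CROSS JOIN cteDigits c CROSS JOIN cteDigits d
     CROSS JOIN cteDigits e CROSS JOIN cteDigits f CROSS JOIN cteDigits g;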

    [font="Arial Black"]Jeff’s Code[/font]

    I believe it was Scott who said that code should be mostly self-documenting and that names should properly represent what is being done and what each object contains. I absolutely agree with that, but I also believe in the power of well-written comments because they just make life easier all the way around. Yes, you can read the code to find out what’s going on (or not... the cascading CTE, for example, would take anyone a bit of time to figure out if they’ve never seen it before), but it’s so much easier to first find a comment that identifies the section of code that addresses a particular problem and then read just that code. For functions, I document them as if I were writing them into Books Online. For procs, I believe that you should be able to remove all of the code and the comments that remain should be good enough for a functional flowchart from which someone could rewrite the code. Of course, that’s just my opinion. Some folks think that comments get in the way. I’ll let you be the judge of that… well, unless you work in my shop, in which case you’ll need to get really good at commenting code. 😉

    Being an old front-end programmer from long ago (hung those shoes up in 2002), I still believe in making life easy through the use of proper encapsulation and abstraction. With that in mind, I built a Tally-Table-like function (others have done similar... not my original idea) using a version of Itzik Ben-Gan’s fine cascading CTE’s to do the job. I’m also one that doesn’t believe in “Hungarian Notation” prefixes for most things but, since I also keep a Tally Table active in the same databases, I couldn’t call the function just “Tally”. I also didn’t want to change the name much because I didn’t want it to be difficult to remember so, I broke my “Hungarian Notation” rule for this one function and called it “fnTally”. It also allows this old sailor to have a bit of fun without it being an HR violation. When someone has a performance problem with some form of sequence code, I ask them if they used a Recursive CTE, While Loop, or Cursor as the sequence generator. If they say “yes”, then I can say “If you used the fnTally function like I said, you wouldn’t have that problem”. It allows me to vent without getting into trouble with HR. 😉 If you don't get that joke, it's ok. Thankfully, HR doesn't either. 😛

    On with the show. Here’s the fnTally function that I currently use in most of my databases. We do need it for the code that follows. I’d tell you how to use it, what the inputs mean, some things to watch out for, and how to get around those things but, guess what? IT’S COMMENTED! 😉

    SET ANSI_NULLS ON
    GO
    SET QUOTED_IDENTIFIER ON
    GO
    CREATE FUNCTION [dbo].[fnTally]
    /**********************************************************************************************************************
     Purpose:
     Return a column of BIGINTs from @ZeroOrOne up to and including @MaxN with a max value of 1 Billion.
     As a performance note, it takes about 00:02:10 (hh:mm:ss) to generate 1 Billion numbers to a throw-away variable.

     Usage:
    --===== Syntax example (Returns BIGINT)
     SELECT t.N
       FROM dbo.fnTally(@ZeroOrOne,@MaxN) t
    ;

     Notes:
     1. Based on Itzik Ben-Gan's cascading CTE (cCTE) method for creating a "readless" Tally Table source of BIGINTs.
        Refer to the following URL for how it works. http://sqlmag.com/sql-server/virtual-auxiliary-table-numbers
     2. To start a sequence at 0, @ZeroOrOne must be 0 or NULL. Any other value that's convertible to the BIT data-type
        will cause the sequence to start at 1.
     3. If @ZeroOrOne = 1 and @MaxN = 0, no rows will be returned.
     4. If @MaxN is negative or NULL, a "TOP" error will be returned.
     5. @MaxN must be a positive number >= the value of @ZeroOrOne, up to and including 1 Billion. If a larger
        number is used, the function will silently truncate after 1 Billion. If you actually need a sequence with
        that many values, you should consider using a different tool. ;-)
     6. There will be a substantial reduction in performance if "N" is sorted in descending order. If a descending
        sort is required, use code similar to the following. Performance will decrease by about 27% but it's still
        very fast, especially compared with just doing a simple descending sort on "N", which is about 20 times slower.
        If @ZeroOrOne is a 0, in this case, remove the "+1" from the code.

        DECLARE @MaxN BIGINT;
         SELECT @MaxN = 1000;
         SELECT DescendingN = @MaxN-N+1
           FROM dbo.fnTally(1,@MaxN);

     7. There is no performance penalty for sorting "N" in ascending order because the output is explicitly sorted by
        ROW_NUMBER() OVER (ORDER BY (SELECT NULL)).

     Revision History:
     Rev 00 - Unknown     - Jeff Moden
            - Initial creation with error handling for @MaxN.
     Rev 01 - 09 Feb 2013 - Jeff Moden
            - Modified to start at 0 or 1.
     Rev 02 - 16 May 2013 - Jeff Moden
            - Removed error handling for @MaxN because of exceptional cases.
    **********************************************************************************************************************/
            (@ZeroOrOne BIT, @MaxN BIGINT)
    RETURNS TABLE WITH SCHEMABINDING AS
     RETURN WITH
      E1(N) AS (SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
                SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1) --10E1 or 10 rows
    , E3(N) AS (SELECT 1 FROM E1 a, E1 b, E1 c) --10E3 or 1 Thousand rows
    , E9(N) AS (SELECT 1 FROM E3 a, E3 b, E3 c) --10E9 or 1 Billion rows
     SELECT N = 0 WHERE ISNULL(@ZeroOrOne,0) = 0 --Conditionally start at 0.
      UNION ALL
     SELECT TOP(@MaxN) N = ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E9 -- Values from 1 to @MaxN
    ;
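
    A quick usage example before we put it to work (this returns the BIGINTs 0 through 10, because the first parameter asks for the zero-based sequence):

    SELECT t.N
      FROM dbo.fnTally(0, 10) t;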

    And here’s my rendition of how to solve this problem. I tend to calculate any “run time” constants up front; it allows some simplification and keeps all such things in one place. Run the code the same way you did the others, making sure that the Actual Execution plan is turned off.

    PRINT '================================================================================';
    PRINT '===== Jeff Test';
    SET STATISTICS TIME, IO ON;
    WITH
    cteStartMinute AS
    ( --=== Precalculate the start date/time of a whole minute two years ago.
      --    The "0s" are the same as '1900-01-01'.
        SELECT StartDTMinute = DATEADD(yy,-2,DATEADD(mi,DATEDIFF(mi,0,GETDATE()),0))
    )
    ,cteMinuteSpan AS
    ( --=== Precalculate how many minutes there are from 2 years ago to 4 years from now (6 years total).
        SELECT StartDTMinute
              ,Minutes = DATEDIFF(mi,StartDTMinute,DATEADD(yy,6,StartDTMinute))
          FROM cteStartMinute
    )
    --===== Use the fnTally function to provide the number of minutes to add.
    --      Note that we're using the function in the "start at 0" mode.
     SELECT DtByMinute = DATEADD(mi, t.N, m.StartDTMinute)
       INTO dbo.JeffTest
       FROM cteMinuteSpan m
      CROSS APPLY (SELECT N FROM dbo.fnTally(0,m.Minutes)) t (N)
    ;
    SET STATISTICS TIME, IO OFF;
    DROP TABLE dbo.JeffTest;
    GO 3

    Here are the displayable results…

    Beginning execution loop
    ================================================================================
    ===== Jeff Test
     SQL Server Execution Times:
       CPU time = 1388 ms,  elapsed time = 1412 ms.
    (3155041 row(s) affected)
    ================================================================================
    ===== Jeff Test
     SQL Server Execution Times:
       CPU time = 1389 ms,  elapsed time = 1427 ms.
    (3155041 row(s) affected)
    ================================================================================
    ===== Jeff Test
     SQL Server Execution Times:
       CPU time = 1342 ms,  elapsed time = 1386 ms.
    (3155041 row(s) affected)
    Batch execution completed 3 times.

    OK… big deal, right? On today’s machines, the differences across more than 3 million rows are measured as a couple of seconds.

    We wrote code that runs roughly 2 times faster than the original and about 3.5 times faster than Scott’s version, uses nearly 90% less CPU than Scott’s, produces no reads, and is a bit shorter (and, hopefully, easier to read), which also means it’s easier to modify or troubleshoot in the future. And if someone looking for similar code finds this thread and uses the pattern for something that will be called tens of thousands of times per day, how much of a machine would you need to handle dozens of similar routines?

    Like Granny used to say, “Mind the pennies and the dollars will take care of themselves.” 😉

    I openly stated that the intent of the code I posted was not pure efficiency:

    Of course you may prefer to adjust the CROSS JOINs for efficiency and/or adjust the names, but make the names meaningful.

    But I also want to avoid awful names like "N", "E", etc. With names like those, reams of comments have to be cached as well, and they must be re-read every time the code is analyzed by anyone. Moreover, all developers know that ultimately you can never trust comments; you must rely on the code itself.

    SQL DBA, SQL Server MVP (07, 08, 09)
    A socialist is someone who will give you the shirt off *someone else's* back.