# Generating n-Tuples with SQL

• dwain.c (5/17/2012)

An example would be a grocer who wishes to assemble for sale gift baskets consisting of various combinations of items that fall within a total cost range (so that he can sell the basket for a specific price and a profit margin). All you need to do is add your cost column and limit the results to those combinations that meet the combined cost criteria. Or add additional criteria to consider number of items (e.g., no less than 6) or cost/profit balancing.

Now we'll show the solution for the grocer. But not just any grocer. Just for fun, we'll choose one who caters specifically to Wesen (http://en.wikipedia.org/wiki/Creatures_of_Grimm).

Six or more products in a cost range of \$80-\$90.

`DECLARE @t TABLE (product VARCHAR(30), cost MONEY)`

`INSERT INTO @t (product, cost)`

`SELECT 'Eyes of Newt', 15`

`UNION ALL SELECT 'Tongues of Toad', 18`

`UNION ALL SELECT 'Serpents'' Fangs', 17`

`UNION ALL SELECT 'Salamanders'' Tails', 21`

`UNION ALL SELECT 'Black Cats'' Whiskers', 11`

`UNION ALL SELECT 'Pangolin Scales', 17`

`UNION ALL SELECT 'Fertilized Egg of Platypus', 12`

`UNION ALL SELECT 'Horse Muffins', 14`

`;WITH UNIQUEnTuplesCost (n, Tuples, Cost) AS (`

` SELECT DISTINCT 1, CAST(product AS VARCHAR(max)), cost`

` FROM @t`

` UNION ALL`

` SELECT 1 + n.n, t.product + ', ' + n.Tuples, n.cost + t.cost`

` FROM @t t JOIN UNIQUEnTuplesCost n ON t.product < n.Tuples`

` WHERE CHARINDEX(t.product, n.Tuples) = 0`

`)`

`SELECT n, Cost, Tuples`

` FROM UNIQUEnTuplesCost`

` WHERE n >= 6 and Cost BETWEEN 80 and 90`

` ORDER BY n, Cost`

Runs in 31ms (CPU) and gives these results (:-P):

`nCostTuples`

`686.00Black Cats' Whiskers, Eyes of Newt, Fertilized Egg of Platypus, Horse Muffins, Pangolin Scales, Serpents' Fangs`

`687.00Black Cats' Whiskers, Eyes of Newt, Fertilized Egg of Platypus, Horse Muffins, Pangolin Scales, Tongues of Toad`

`687.00Black Cats' Whiskers, Eyes of Newt, Fertilized Egg of Platypus, Horse Muffins, Serpents' Fangs, Tongues of Toad`

`689.00Black Cats' Whiskers, Fertilized Egg of Platypus, Horse Muffins, Pangolin Scales, Serpents' Fangs, Tongues of Toad`

`690.00Black Cats' Whiskers, Eyes of Newt, Fertilized Egg of Platypus, Pangolin Scales, Serpents' Fangs, Tongues of Toad`

`690.00Black Cats' Whiskers, Eyes of Newt, Fertilized Egg of Platypus, Horse Muffins, Salamanders' Tails, Serpents' Fangs`

`690.00Black Cats' Whiskers, Eyes of Newt, Fertilized Egg of Platypus, Horse Muffins, Pangolin Scales, Salamanders' Tails`

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

My thought question: Have you ever been told that your query runs too fast?

INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

• Crud. It's not my day. I have to be honest even if it's not going to look so good for me. :blush: I don't know if I copied/typed something wrong before or what (lost power overnight and hadn't saved it) but I can't make the simple (same logic) While Loop conversion beat the rCTE in this case. I swear I did yesterday but simply cannot prove it so I'll eat the crow on this one.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

• Jeff Moden (5/17/2012)

Crud. It's not my day. I have to be honest even if it's not going to look so good for me. :blush: I don't know if I copied/typed something wrong before or what (lost power overnight and hadn't saved it) but I can't make the simple (same logic) While Loop conversion beat the rCTE in this case. I swear I did yesterday but simply cannot prove it so I'll eat the crow on this one.

Honestly, the horse muffins probably tasted better than crow...

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

My thought question: Have you ever been told that your query runs too fast?

INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

• Not so sure about that. I've beat the rCTE for this problem for CPU and absolutely smoked it for reads just by doing the "simple" conversion to a While Loop. It took about 5 minutes to write the code from the rCTE model. I just can't get the "simple" conversion to beat it for duration in this case although it comes close. I don't know what others think but duration is what the customer perceives and that makes duration one of the most important measurements from where I stand.

I'm happy that folks are able to solve complex problems in such a short time with rCTE's (takes about the same time with a While loop to solve). I am concerned for it all though. Some rCTE's are going to turn out fine and others are going to eat the guts out of the server even if they do appear to be quick (and not all will appear so). For example, the n-Tuple rRCTE uses about 3 times as many reads from a Temp table that it builds behind the scenes as the While Loop does. While one such rCTE may not kick TempDB over the fence, a bunch could. Yep... I agree... if you have resources, you should use them. I'm just not sure that using 3 times as many as what are really needed is a prudent thing to do.

My advice stands. "It Depends". Nothing in SQL Server is a panacea of performance and I've seen more problems with rCTEs than not especially when it comes to reads. Even logical reads becomes a problem when you have 3 times as many as another process. If you use an rCTE, please be sure to test it for scalability because it might not be (see the following article for an example of when they might not be). Even then, try another method against the problem. You might be surprised at what you can come up with. Or not. π

http://www.sqlservercentral.com/articles/T-SQL/74118/

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

• Jeff Moden (5/17/2012)

Not so sure about that. I've beat the rCTE for this problem for CPU and absolutely smoked it for reads just by doing the "simple" conversion to a While Loop. It took about 5 minutes to write the code from the rCTE model. I just can't get the "simple" conversion to beat it for duration in this case although it comes close. I don't know what others think but duration is what the customer perceives and that makes duration one of the most important measurements from where I stand.

Duration is often my primary tuning metric. Certainly there are other considerations, like CPU usage, I/O, workspace memory, read-ahead utilization, and buffer pool impact, but duration is often the most important. That said, predictability is also important - a method may be less attractive if it has awesome best-case performance but very poor worst-case performance.

Jeff Moden (5/17/2012)

I'm happy that folks are able to solve complex problems in such a short time with rCTE's (takes about the same time with a While loop to solve). I am concerned for it all though. Some rCTE's are going to turn out fine and others are going to eat the guts out of the server even if they do appear to be quick (and not all will appear so).

No argument with this. Recursive CTEs are only occasionally the 'best' solution.

Jeff Moden (5/17/2012)

For example, the n-Tuple rRCTE uses about 3 times as many reads from a Temp table that it builds behind the scenes as the While Loop does. While one such rCTE may not kick TempDB over the fence, a bunch could. Yep... I agree... if you have resources, you should use them. I'm just not sure that using 3 times as many as what are really needed is a prudent thing to do.

I always say this, but it is worth repeating. Logical I/O (reads) are reported differently for many work tables. For a normal table (including user-defined temporary tables), logical I/O is reported in 8KB pages (and very many rows can fit on a single page). Internal work tables are not accessible directly by users, so reporting logical I/O page counts would not be very useful. For that reason, STATISTICS IO reports logical I/O on most work tables in rows not pages. Unless you are aware of this, it can seem that work tables are performing way more logical reads than is actually the case. None of this should be interpreted as saying spool performance can never be a problem - it can - but we have to compare apples with apples.

• @ Dwain, grocer example and all of a sudden the lightbulb went on! π

Got it now, thanks:cool:

(Maybe yesterday I was just working to hard, and didnt focus!)

• The above debate between Jeff and Paul is precisely the kind of discussion that I was hoping that my article would generate. I am particularly interested in seeing the looping solution with the longer elapsed times, because I think it would be useful in solving the business problems spawned from the seemingly theoretical example covered in my article. I looked carefully at my elapsed times and wondered why thy were increasing seemingly much faster than the CPU, thinking that the likely cause was IO and rendering the final rowset back to the results pane. However I am insufficiently experienced to interpret the results I was seeing.

chris.stuart (5/17/2012)

@ Dwain, grocer example and all of a sudden the lightbulb went on! π

Got it now, thanks:cool:

(Maybe yesterday I was just working to hard, and didnt focus!)

I'm coming to the opinion that the biggest shortfall of the article was not to include the business cases for application within the article itself. That may have generated even more interest and led to further interesting discourse in this discussion thread.

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

My thought question: Have you ever been told that your query runs too fast?

INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

• Ok... since no one else took the testing suggestion...

`--=====================================================================================================================`

`-- Test setup`

`--=====================================================================================================================`

`--===== Enable the auto-display of rowcounts just to get a little feedback on the test table.`

` SET NOCOUNT OFF;`

`--===== Conditionally drop the test table to make reruns easier. I use a Temp Table instead of a Table Variable here`

` -- because I want to measure batches separated by GO using SQL Profiler.`

` IF OBJECT_ID('tempdb..#T','U') IS NOT NULL DROP TABLE #T;`

`--===== Create the test table`

` SELECT TOP (10) --<---<<<< Change this number to vary the number of values in StrCol.`

` StrCol = CAST(RIGHT('000'+CAST(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS VARCHAR(3)),3) AS CHAR(3))`

` INTO #t`

` FROM sys.all_columns`

`;`

`--===== Conditionally drop the result tables to make reruns easier.`

` IF OBJECT_ID('tempdb..#R1','U') IS NOT NULL DROP TABLE #R1;`

` IF OBJECT_ID('tempdb..#R2','U') IS NOT NULL DROP TABLE #R2;`

`--===== Disable the auto-display of rowcounts like I normally would in a stored procedure.`

` SET NOCOUNT ON;`

`--=====================================================================================================================`

`-- Run the test with clear markers that show up in SQL Profiler.`

`--=====================================================================================================================`

`GO`

`--===== Recursive CTE =================================================================================================`

` WITH UNIQUEnTuples (n, Tuples)`

` AS`

`(`

` SELECT DISTINCT 1, CAST(strcol AS VARCHAR(max))`

` FROM #T`

` UNION ALL`

` SELECT 1 + n.n, t.strcol + ',' + n.Tuples`

` FROM #T t JOIN UNIQUEnTuples n`

` ON t.strcol < n.Tuples`

` WHERE CHARINDEX(t.strcol, n.Tuples) = 0`

`)`

` SELECT *`

` INTO #R1`

` FROM UNIQUEnTuples`

`;`

`GO`

`--===== While Loop and Temp Table =====================================================================================`

` -- Note that I'm not proposing this as "THE" replacement for an rCTE. The single purpose of this test is to`

` -- simply compare a the given rCTE to a nearly identical While Loop.`

`DECLARE @Count INT,`

` @Counter INT`

`;`

`--===== Replacement for the "Anchor" section of the rCTE.`

` SELECT N = 1,`

` Tuples = CAST(strcol AS VARCHAR(800))`

` INTO #UniqueNTuples`

` FROM #T`

`;`

` SELECT @Count = @@ROWCOUNT-1,`

` @Counter = 1`

`;`

`--===== Replacement for the "Recursive" section the rCTE`

` WHILE @Counter <= @Count`

` BEGIN`

` INSERT INTO #UniqueNTuples`

` (N, Tuples)`

` SELECT DISTINCT n.n+1, t.strcol + ',' + n.Tuples`

` FROM #t t JOIN #UniqueNTuples n`

` ON t.strcol < n.Tuples`

` AND n.n = @Counter`

` AND CHARINDEX(t.strcol, n.Tuples) = 0;`

` SELECT @Counter = @Counter + 1;`

` END`

`;`

`GO`

I didn't document a full spectrum test like Dwain did because I was actually hoping someone besides me would do it for a change. π My anecdotal observations have been that the rCTE does a pretty good job for duration and the While loop gets pretty close in a lot of instances but doesn't beat it like I've seen in other problem examples. The code is setup to make some pretty easy to read output in SQL Profiler if you set it up to record batches (it does have a While Loop in the code which is why I setup for batches).

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

• dwain.c (5/17/2012)

The above debate between Jeff and Paul...

I'm not looking at it as a debate. It's just the normal kind of discussion two wireheads have about different things. π

Shifting gears...

Paul, where did you get the information about what the logical reads are by table type?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

• Jeff Moden (5/18/2012)

Paul, where did you get the information about what the logical reads are by table type?

I wanted to give credit to the person that first wrote about this, but I have failed to remember who it was (it was years ago).

• Jeff Moden (5/18/2012)

Ok... since no one else took the testing suggestion...

I didn't document a full spectrum test like Dwain did because I was actually hoping someone besides me would do it for a change. π My anecdotal observations have been that the rCTE does a pretty good job for duration and the While loop gets pretty close in a lot of instances but doesn't beat it like I've seen in other problem examples. The code is setup to make some pretty easy to read output in SQL Profiler if you set it up to record batches (it does have a While Loop in the code which is why I setup for batches).

Probably no one else watching is as concerned about performance as you are. π Either that or they didn't see an alternative solution.

Yours is certainly interesting. I thought you might try to decompose the anchor/recursive CTE legs but honestly couldn't figure out how you'd go about it. Now I know.

A question though. I notice your solution is INSERTing all its results into the temp table, while mine is SELECTing them back to results. Isn't that kinda like comparing apples to oranges? At least, it sounds like something GSquared (Gus?) once said to me.

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

My thought question: Have you ever been told that your query runs too fast?

INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

• If you want to make a rCTE run really quickly, then minimise the amount of "r" in it...

`;WITH rCTE AS`

`(`

` SELECT [level] = 0`

` FROM (VALUES (0)) d (n)`

` UNION ALL`

` SELECT [level] = 1`

` FROM rCTE`

`CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) d (n)`

`CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) e (n)`

`CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) f (n)`

`CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) g (n)`

`CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) h (n)`

`WHERE [level] = 0`

`)`

`SELECT [level],`

`n = ROW_NUMBER() OVER(ORDER BY (SELECT PI()))`

`FROM rCTE`

This quaint snippet runs in (only) about 40% more time than the usual inline CTE tally table - which is itself no longer the fastest kid on the block.

βWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.β - Gail Shaw

Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

• ChrisM@Work (5/30/2012)

This quaint snippet runs in (only) about 40% more time than the usual inline CTE tally table - which is itself no longer the fastest kid on the block.

Hey so what is the fastest kid on the block these days?

• ChrisM@Work (5/30/2012)

If you want to make a rCTE run really quickly, then minimise the amount of "r" in it...

`;WITH rCTE AS`

`(`

` SELECT [level] = 0`

` FROM (VALUES (0)) d (n)`

` UNION ALL`

` SELECT [level] = 1`

` FROM rCTE`

`CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) d (n)`

`CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) e (n)`

`CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) f (n)`

`CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) g (n)`

`CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) h (n)`

`WHERE [level] = 0`

`)`

`SELECT [level],`

`n = ROW_NUMBER() OVER(ORDER BY (SELECT PI()))`

`FROM rCTE`

This quaint snippet runs in (only) about 40% more time than the usual inline CTE tally table - which is itself no longer the fastest kid on the block.

Thanks Chris! Another interesting contribution to my library of recursive CTE examples and an excellent suggestion to keep recursing to a minimum.

Thanks to you and SQL Kiwi for stopping by (again)!

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

My thought question: Have you ever been told that your query runs too fast?

INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

• dwain.c (5/18/2012)

A question though. I notice your solution is INSERTing all its results into the temp table, while mine is SELECTing them back to results. Isn't that kinda like comparing apples to oranges? At least, it sounds like something GSquared (Gus?) once said to me.

Not really. I made it so both the rCTE and the While loop would dump into a temp table... I wanted to give the rCTE a decent chance to keep up with the While Loop at the higher numbers. π

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.