# Generating n-Tuples with SQL

By Dwain Camps, 2012/05/17

I ran across an interesting problem the other day that required this and I thought I’d post on this topic of n-Tuples as a result.

First what is an n-Tuple?  Suppose you have a list of unique character strings like ‘1’ ‘2’ ‘3’ and these are stored as rows in a table. Tuples are simply combinations of these values, so 1-Tuples would be:

`1`

`2`

`3`

2-Tuples would be:

`1,2`

`1,3`

`2,1`

`2,3`

`3,1`

`3,2`

And of course there are these 3-Tuples:

`1,2,3`

`1,3,2`

`2,1,3`

`2,3,1`

`3,1,2`

`3,2,1`

Let’s now write a SQL query that generates Tuples for all n where we have unique strings we want to combine:

```DECLARE @t TABLE (strcol CHAR(3))

INSERT INTO @t (strcol)
SELECT '001' UNION ALL SELECT '002' UNION ALL SELECT '003'
--UNION ALL SELECT '004' UNION ALL SELECT '005' UNION ALL SELECT '006'
--UNION ALL SELECT '007'  UNION ALL SELECT '008'
--UNION ALL SELECT '009' UNION ALL SELECT '010'
;WITH nTuples (n, Tuples) AS (
SELECT DISTINCT 1, CAST(strcol AS VARCHAR(max))
FROM @t
UNION ALL
SELECT 1 + n.n, t.strcol + ',' + n.Tuples
FROM @t t JOIN nTuples n ON t.strcol <> n.Tuples
WHERE CHARINDEX(t.strcol, n.Tuples) = 0
)
SELECT *
FROM nTuples
ORDER BY n, Tuples```

This recursive CTE returns the following results set when run for n=3:

`n             Tuples`

`1              001`

`1              002`

`1              003`

`2              001,002`

`2              001,003`

`2              002,001`

`2              002,003`

`2              003,001`

`2              003,002`

`3              001,002,003`

`3              001,003,002`

`3              002,001,003`

`3              002,003,001`

`3              003,001,002`

`3              003,002,001`

Interestingly enough, that wasn’t really the result I needed but it was pretty close.  What I really needed was what I call “unique, ordered” n-Tuples.  In other words, if I list out the 2-Tuples from above, I can show you only the ones I need:

`1,2   [Want]`

`1,3   [Want]`

`2,1   [Don’t Want]`

`2,3   [Want]`

`3,1   [Don’t Want]`

`3,2   [Don’t Want]`

This means that 1,2 is equivalent to 2,1 but I want only want the first one where the data is ordered across the Tuple.  Likewise, there will then be only 1 3-Tuple that I’m interested in: 1,2,3

This actually ends up being a pretty simple change to the above CTE.  The not equals (<>) has been changed to less than (<).

```;WITH UNIQUEnTuples (n, Tuples) AS (
SELECT DISTINCT 1, CAST(strcol AS VARCHAR(max))
FROM @t
UNION ALL
SELECT 1 + n.n, t.strcol + ',' + n.Tuples
FROM @t t JOIN UNIQUEnTuples n ON t.strcol < n.Tuples
WHERE CHARINDEX(t.strcol, n.Tuples) = 0
)
SELECT *
FROM UNIQUEnTuples
ORDER BY n, Tuples```

The results from running this SQL with n=3 are:

`n             Tuples`

`1              001`

`1              002`

`1              003`

`2              001,002`

`2              001,003`

`2              002,003`

`3              001,002,003`

Thinking about it, I realized that the row sets generated by these bad boys are going to grow quite quickly!  After a little bit of playing around and without resorting to looking up the combinatorial statistics, I was able to deduce that the number of records in the respective row sets are generated by these formulas:

`n-Tuples: Sum over n + n*(n-1) + n*(n-1)*(n-2) + … + n*(n-1)*…*(n-(n-1))`

Unique, ordered, n-Tuples: 2^n-1

For the n-Tuples formula, you can verify in a spreadsheet that the last 2 factors in the series for n>=2 are n! and that, my intrepid reader, becomes a really big number very quickly!

Of course, being constantly performance-minded, I couldn’t help but wonder how well SQL would handle these huge row sets, so I set about to running my SQL across incrementing n’s.

For n-Tuples, I got these results (CPU and elapsed time are in milliseconds):

`n  #n-Tuples CPU    Elapsed`

`1  1         0      1`

`2  4         0      1`

`3  15        0      1`

`4  64        0      2`

`5  325       15     83`

`6  1956      47     202`

`7  13699     468    831`

`8  109600    3682   7033`

`9  986409    41809  116940`

`10 9864100   485054 1410456`

At n=10, I decided to stop because I’d be too impatient to run for n=11!

But you can take the unique, ordered, n-Tuples much further as the progression of increase in number of the rows in the row sets is much slower.

`n  #n-Tuples CPU    Elapsed`

`1  1         0      0`

`2  3         0      0`

`3  7         0      0`

`4  15        0      0`

`5  31        0      2`

`6  63        15     2`

`7  127       0      4`

`8  255       15     62`

`9  511       0      34`

`10 1023      16     70`

`11 2047      63     264`

`12 4095      109    218`

`13 8191      234    392`

`14 16383     453    799`

`15 32767     1030   1717`

`16 65535     2230   3711`

`17 131071    4555   8535`

`18 262143    10452  27458`

`19 524287    21871  58594`

`20 1048575   45802  135406`

`21 2097151   --     --`

`22 4194303   --     --`

`23 8388607   --     --`

Once again, I’ll be too impatient to wait for n=21 to complete!

There are some caveats to these functions that you should be aware of:

1. If there are any duplicates in your source table, these will be eliminated in the results (by design).
2. You need to take care if your data points look like A, AA, AAA.  Note how the source table has defined this column as CHAR(3).  This is also by design.  The CHARINDEX statement would remove the n-Tuples (where n>1) of this particular set if you used VARCHAR(3) instead.  So if you’re working with integers you want n-Tuples of, your best course of action is to left pad them with 0’s or right pad them with blanks.

We hope you enjoyed this quick excursion into the world of recursive CTEs and n-Tuples.  We know we most certainly did.  Perhaps even one day it will be as useful for you as it was for me.

Now if I can only figure out how these pesky recursive CTEs really work, then I could post something really interesting!

Total article views: 8137 | Views in the last 30 days: 7

Related Articles
FORUM

### MDX query using ORDER and UNION functions

Having trouble using the ORDER function in an MDX query that uses a UNION

FORUM

### How to use"Union ALL" to join sql queries each containing order by clause

How to use"Union ALL" to join sql queries each containing order by clause

FORUM

SQL UNION

FORUM

### order by inside union (selects)

I have a series of select statements that are unioned togther. 1 of them I would like to be able to...

FORUM

### Selecting a Value of the Order Within a Group

Select To Indicate Order Number Within a Group

Tags
 permutation sql n-tuples t-sql

## Join the most active online SQL Server Community

### SQL knowledge, delivered daily, free:

#### You make SSC a better place

As a member of SQLServerCentral, you get free access to loads of fresh content: thousands of articles and SQL scripts, a library of free eBooks, a weekly database news roundup, a great Q & A platformâ€¦ And itâ€™s our huge, buzzing community of SQL Server Professionals that makes it such a success.