Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 

The Voice of the DBA

Steve Jones is the editor of SQLServerCentral.com and visits a wide variety of data related topics in his daily editorial. Steve has spent years working as a DBA and general purpose Windows administrator, primarily working with SQL Server since it was ported from Sybase in 1990. You can follow Steve on Twitter at twitter.com/way0utwest

Combinations and Permutations

I ran into an interesting question from someone asking for all combinations of numbers. The thread was here, and it was confusing at first. However we started to understand what was being asked and eventually the person stopped, realizing that there were too many combinations.

However at one point there was this quote: I’m looking for combinations, not permutations since the order is unimportant.

That surprised me since I was thinking of things in the opposite manner. However when I read this article, it made sense: combinations and permutations

There is a table in the article that cleared it up. I’ve reproduced it here:

Order does matter Order doesn’t matter

1 2 3

 

1 3 2

 

2 1 3

1 2 3

2 3 1

 

3 1 2

 

3 2 1

The left hand side is permutations, where order matters. The right hand side is combinations, of which there is only one. Order doesn’t matter so 1-2-3 is the same as 1-3-2, which is the same as 3-2-1 and all other orderings.

This shows mathematical semantics, but those are important in SQL because we do need to talk about combinations and permutations. A CROSS JOIN is normally how we handle combinations, which for three numbers are usually thought of as the way to handle all combinations, but I’m not sure how that works here.

If I create a quick table:

CREATE TABLE Combinations
( id int) GO INSERT Combinations select 1
INSERT Combinations select 2
INSERT Combinations select 3

and then perform a cross join:

SELECT a.id, b.id
 FROM Combinations a
   CROSS JOIN dbo.Combinations b

I get this:

id          id

———– ———–

1           1

2           1

3           1

1           2

2           2

3           2

1           3

2           3

3           3

Not all combinations. Even if I add in a third result:

SELECT a.id, b.id, c.id
 FROM Combinations a
   CROSS JOIN dbo.Combinations b
   CROSS JOIN dbo.Combinations c

id          id          id

———– ———– ———–

1           1           1

1           2           1

1           3           1

1           1           2

1           2           2

1           3           2

1           1           3

1           2           3

1           3           3

2           1           1

2           2           1

2           3           1

2           1           2

2           2           2

2           3           2

2           1           3

2           2           3

2           3           3

3           1           1

3           2           1

3           3           1

3           1           2

3           2           2

3           3           2

3           1           3

3           2           3

3           3           3

However, what about if I limit the result to remove duplicates:

SELECT a.id, b.id, c.id
 FROM Combinations a
   CROSS JOIN dbo.Combinations b
   CROSS JOIN dbo.Combinations c
WHERE a.id != b.id
AND b.id != c.id
AND a.id != c.id

That seems to work better:

id          id          id

———– ———– ———–

1           3           2

1           2           3

2           3           1

2           1           3

3           2           1

3           1           2

However that’s not a great solution. I’ve hardcoded the number of items, and it’s a cumbersome query. I think there has to be a better solution, but it’s probably one that’s beyond my T-SQL skills. If I look at 5 numbers, I get this:

SELECT a.id 'a', b.id 'b', c.id 'c', d.id 'd', e.id 'e' FROM Combinations a
   CROSS JOIN dbo.Combinations b
   CROSS JOIN dbo.Combinations c
   CROSS JOIN dbo.Combinations d
   CROSS JOIN dbo.Combinations e
WHERE a.id != b.id
AND a.id != c.id
AND a.id != d.id
AND a.id != e.id
AND b.id != c.id
AND b.id != d.id
AND b.id != e.id
AND c.id != d.id
AND c.id != e.id
AND d.id != e.id
ORDER BY a, b, c, d

I get 120 rows. If I look at 5!, that’s 120, so I think I’m correct. I’m not duplicating all the results, nor am I  going to look through them all, but they seem correct and ordered.

The thing this solution leaves out is the combinations that are less than the total number of items, as asked in the thread. So expanded combinations would be:

1

2

3

4

5

1 – 2

1 – 3

1 – 4

1 – 4

1 – 2 – 3

Not worth including those, but I think that would result in a series of UNION queries, which would get really ugly.

I’ll ask around, but if anyone has an interesting way to solve this that’s cleaner, I’d be interested.

Update

Someone posted this, which seems to work wonderfully:

DECLARE @s VARCHAR(25)
,@Iteration Int
SET @s = ‘ABC’;
SET @Iteration = LEN(@s);

WITH E1(N) AS ( –=== Create Ten 1′s
SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 –10
),
cteTally(N) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT N)) FROM E1
),CteCombos AS (
SELECT CAST(SUBSTRING(@s, N, 1) AS VARCHAR(25)) AS Token,
CAST(‘.’+CAST(N AS CHAR(1))+’.’ AS VARCHAR(52)) AS Permutation,
CAST(1 AS INT) AS Iteration
FROM cteTally WHERE N <= @Iteration
UNION ALL
SELECT CAST(Token+SUBSTRING(@s, N, 1) AS VARCHAR(25)) AS Token,
CAST(Permutation+CAST(N AS CHAR(1))+’.’ AS VARCHAR(52)) AS
Permutation,
s.Iteration + 1 AS Iteration
FROM CteCombos s
INNER JOIN cteTally n
ON s.Permutation NOT LIKE ‘%.’+CAST(N AS CHAR(1))+’.%’
AND s.Iteration < @Iteration
AND N <= @Iteration
)
SELECT Token,Permutation,Iteration
FROM CteCombos
WHERE Iteration = @Iteration
ORDER BY Permutation

If you alter the variable (abc) to the number of items you need, so “abcde” for 5, this returns the combinations.


Filed under: Blog Tagged: syndicated, T-SQL

Comments

Posted by Jason Brimhall on 8 June 2011

As an update to that script, it currently only works up to 9 characters in length.  I can't remember if I posted a blog on it, but it can be slightly modified to support 10+ characters through a few simple changes.  e.g. char(1) to char(2).

Posted by Joe Celko on 13 June 2011

Please stop using !=; that is needless dialect. Here is a clean way to do this. Take a subset from your Series table of numbers. Use the NOT IN () predicate which is surprisingly fast and easy to write with a little cut&paste.

CREATE TABLE Series (seq SMALLINT NOT NULL PRIMARY KEY);

INSERT INTO Series VALUES   (1), (2), (3), (4), (5), (6), (7), (8), (9), (10);

WITH Perm(i)

AS

(SELECT seq

  FROM Series

WHERE seq <= 5)

SELECT P1.i AS p1, P2.i AS p2, P3.i AS p3, P4.i AS p4, P5.i AS p5

 FROM Perm AS P1

      CROSS JOIN Perm AS P1

      CROSS JOIN Perm AS P1

      CROSS JOIN Perm AS P1

     CROSS JOIN Perm AS P1

WHERE P1.i NOT IN (P2.i, P3.i, P4.i, P5.i)

 AND P2.i NOT IN (P3.i, P4.i, P5.i)

 AND P3.i NOT IN (P4.i, P5.i)

 AND P4.i NOT IN (P5.i); -– just be consistent

Combinations are a bit uglier but still a cut&paste job . Here is 5 out of 10 items

WITH Comb(i)

AS

(SELECT seq

  FROM Series

 WHERE seq <= 10)

SELECT C1.i AS c1, C2.i AS c2, C3.i AS c3, C4.i AS c4, C5.i AS c5

 FROM Comb AS C1

      CROSS JOIN Comb AS C2

      CROSS JOIN Comb AS C3

      CROSS JOIN Comb AS C4

      CROSS JOIN Comb AS C5

WHERE C1.i < ALL (SELECT i FROM (VALUES (C2.i), (C3.i), (C4.i), (C5.i)) AS X(i))

 AND C2.i < ALL (SELECT i FROM (VALUES (C3.i), (C4.i), (C5.i)) AS X(i))

 AND C3.i < ALL (SELECT i FROM (VALUES (C4.i), (C5.i)) AS X(i))

 AND C4.i < C5.i;

Posted by Joe Celko on 13 June 2011

Please stop using !=; that is needless dialect. Here is a clean way to do this. Take a subset from your Series table of numbers. Use the NOT IN () predicate which is surprisingly fast and easy to write with a little cut&paste.

CREATE TABLE Series (seq SMALLINT NOT NULL PRIMARY KEY);

INSERT INTO Series VALUES   (1), (2), (3), (4), (5), (6), (7), (8), (9), (10);

WITH Perm(i)

AS

(SELECT seq

  FROM Series

WHERE seq <= 5)

SELECT P1.i AS p1, P2.i AS p2, P3.i AS p3, P4.i AS p4, P5.i AS p5

 FROM Perm AS P1

      CROSS JOIN Perm AS P1

      CROSS JOIN Perm AS P1

      CROSS JOIN Perm AS P1

     CROSS JOIN Perm AS P1

WHERE P1.i NOT IN (P2.i, P3.i, P4.i, P5.i)

 AND P2.i NOT IN (P3.i, P4.i, P5.i)

 AND P3.i NOT IN (P4.i, P5.i)

 AND P4.i NOT IN (P5.i); -– just be consistent

Combinations are a bit uglier but still a cut&paste job . Here is 5 out of 10 items

WITH Comb(i)

AS

(SELECT seq

  FROM Series

 WHERE seq <= 10)

SELECT C1.i AS c1, C2.i AS c2, C3.i AS c3, C4.i AS c4, C5.i AS c5

 FROM Comb AS C1

      CROSS JOIN Comb AS C2

      CROSS JOIN Comb AS C3

      CROSS JOIN Comb AS C4

      CROSS JOIN Comb AS C5

WHERE C1.i < ALL (SELECT i FROM (VALUES (C2.i), (C3.i), (C4.i), (C5.i)) AS X(i))

 AND C2.i < ALL (SELECT i FROM (VALUES (C3.i), (C4.i), (C5.i)) AS X(i))

 AND C3.i < ALL (SELECT i FROM (VALUES (C4.i), (C5.i)) AS X(i))

 AND C4.i < C5.i;

Leave a Comment

Please register or log in to leave a comment.