Combinations, Left Joins, and Cartesian Products... what works?

  • Suppose there's a 3 number lottery that pays out with 1, 2, and 3 number matches. For simplicity's sake, suppose that there are only 3 balls to draw from which are drawn without replacement. I want to create a list of possible winning combinations.

    I can do this creating cartesian products and using Union statements. I'd like to simplify it using left joins, but it isn't working. Can anybody explain why? Is there a better way (cross joins perhaps)? Here's the code followed by the query results...

    CREATE TABLE #Numbers (Number int)

    Insert into #Numbers (Number) values(1)

    Insert into #Numbers (Number) values(2)

    Insert into #Numbers (Number) values(3)

    --Approach 1, this correctly returns 7 rows.

    SELECT

    Na.Number [Ball 1], NULL [Ball 2], NULL [Ball 3]

    FROM #Numbers Na

    UNION ALL

    SELECT

    Na.Number, Nb.Number, NULL

    FROM #Numbers Na, #Numbers Nb

    WHERE

    Na.Number < Nb.Number

    UNION ALL

    SELECT

    Na.Number, Nb.Number, Nc.Number

    FROM #Numbers Na, #Numbers Nb, #Numbers Nc

    WHERE

    Na.Number < Nb.Number

    AND

    Nb.Number < Nc.Number

    --Approach 2, this returns only 4 rows.

    SELECT Na.Number [Ball 1], Nb.Number [Ball 2], Nc.Number [Ball 3]

    FROM #Numbers Na

    LEFT JOIN #Numbers Nb

    ON Na.Number < Nb.Number

    LEFT JOIN #Numbers Nc

    ON Nb.Number < Nc.Number

    DROP TABLE #Numbers

    The results

    This is what I want

    Ball 1 Ball 2 Ball 3

    ------------------------

    1 NULL NULL

    2 NULL NULL

    3 NULL NULL

    1 2 NULL

    1 3 NULL

    2 3 NULL

    1 2 3

    (7 row(s) affected)

    This is missing some rows for some reason

    Ball 1 Ball 2 Ball 3

    -----------------------

    1 2 3

    1 3 NULL

    2 3 NULL

    3 NULL NULL

    (4 row(s) affected)

    So why doesn't the second query return 1 Null Null, 2 Null Null, and 1 2 Null?

    Thanks

    Stephen Sjodin

  • Reason: By including NULLS in your SELECTS, you are sort of adding extra NULL values to the sets Nb and Nc...

    You can achieve the same by doing this:

    SELECT Na.Number [Ball 1], Nb.Number [Ball 2], Nc.Number [Ball 3]

    FROM #Numbers Na

    LEFT JOIN (SELECT Number FROM #Numbers UNION ALL SELECT NULL) Nb

    ON Na.Number < ISNULL(Nb.Number,Na.Number+1)

    LEFT JOIN (SELECT Number FROM #Numbers UNION ALL SELECT NULL) Nc

    ON Nb.Number < ISNULL(Nc.Number,Nb.Number+1)

    Here, I am adding the extra NULLS into the Nb and Nc sets to get the desired result.

    This is absolutely NOT scalable - don't even think of going above a 100 or so "Balls" unless you have a lot of patience and ram.

    Also, this is not faster or better than the UNION solution - just wanted to show WHY...and HOW...

    MM



    select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);

  • Forum Etiquette: How to post Reporting Services problems
  • [/url]
  • Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
  • [/url]
  • How to Post Performance Problems - by Gail Shaw
  • [/url]

  • Thanks Mr. Magoo.

    I think the problem I'm seeing in approach 2 is due to the left joins' "on" statements. For example, when the filter Na.Number < Nb.Number can be satisfied in more than one condition it only returns the last of the possible Na.Number < Nb.Number results. This can be more clearly seen if you add another ball into the code (Insert into #Numbers (Number) values(4)).

    1 2 NULL

    1 3 NULL

    1 4 NULL -- of the combinations starting with "1", only this row is returned.

    I guess when the joins are testing for Nx.Number<Ny.Number and Ny.Number doesn't exist, something is throwing a wrench into the gears.

    I can see how your solution works - it explicitly introduces a record with a null value to test against rather than letting SQL figure it out. I've achieved the same results by inserting a record with a Null value for the number and changing my code as follows:

    Insert into #Numbers (Number) values(NULL) -- a null record to filter on

    SELECT Na.Number [Ball 1], Nb.Number [Ball 2], Nc.Number [Ball 3]

    FROM #Numbers Na

    LEFT JOIN #Numbers Nb

    ON Na.Number < Nb.Number OR Nb.Number IS NULL

    LEFT JOIN #Numbers Nc

    ON Nb.Number < Nc.Number OR (Nc.Number IS NULL AND Nb.Number IS NOT NULL)

    WHERE Na.Number IS NOT NULL

    ORDER BY Nc.Number, Nb.Number, Na.Number --remove when comparing with Mr. Magoo's code

    I would note however that according to the execution metrics in SQL, your query is considerably faster than mine, even when I use your join construct Nx.Number < ISNULL(Ny.Number, Nx.Number+1). It seems adding the null value inline the way you did with the subquery is better than adding an actual null value to the #Numbers table. I don't know why, but clearly you know your stuff.

    Anyway, if the issue is indeed about testing records with values against records with Null values versus testing records with values against records that don't exist at all, is there some provision in TSQL for the latter?

  • Viewing 3 posts - 1 through 2 (of 2 total)

    You must be logged in to reply to this topic. Login to reply