SQL query to display the count of combinations

  • Hi Guys,

    I need huge help to build this Logic

    Sample Table:

    ID Process

    12 ER

    12 TE

    12 YT

    13 ER

    14 WR

    15 ER

    15 YT

    15 TE

    16 WR

    the goal is to show all the process combinations and how many ID has the same combination(Count of ID)

    Process....CombID.........NoOfMatch

    ER..............1.....................2

    TE..............1

    YT..............1

    ER..............2.....................1

    WR.............3.....................2

    Thanks in advance.

  • Here's the process I went through:

    * For each ID, Generate the combinations as a separate field ('ER,TE,YT'), using the XML string concatenation trick

    * For each of those combinations, figure out the minimum ID which has those as combinations

    * Calculate the ordering number using the DENSE_RANK for those IDs

    * Count the distinct IDs within each DENSE_RANK value Distinct ID count

    * SELECT the DISTINCT Process, Combination ID and Distinct ID count ("NoOfMatch")

    What? You want code too?

    OK.

    The first chunk (before the "WITH") is putting your data into a table variable so that we've got something to play with.

    The rest of it is the bit that actually does the work. Each SELECT statement represents one of the steps in the list above.

    USE tempdb;

    DECLARE @Process TABLE

    (

    ID INTEGER ,

    Process CHAR(2)

    );

    INSERT INTO @Process

    VALUES ( 12, 'ER' ),

    ( 12, 'TE' ),

    ( 12, 'YT' ),

    ( 13, 'ER' ),

    ( 14, 'WR' ),

    ( 15, 'ER' ),

    ( 15, 'YT' ),

    ( 15, 'TE' ),

    ( 16, 'WR' );

    WITH Combos

    AS ( SELECT p.ID ,

    p.Process ,

    ( SELECT STUFF(( SELECT ',' + pInner.Process

    FROM @Process pInner

    WHERE pInner.ID = p.ID

    ORDER BY pInner.Process

    FOR

    XML PATH('')

    ), 1, 1, '')

    ) AS Combo

    FROM @Process p

    ),

    MinComboIDs

    AS ( SELECT Combos.ID ,

    Combos.Process ,

    Combos.Combo ,

    MIN(Combos.ID) OVER ( PARTITION BY Combos.Combo ) AS MinComboID

    FROM Combos

    ),

    GotComboIDs

    AS ( SELECT MinComboIDs.ID ,

    MinComboIDs.Process ,

    MinComboIDs.Combo ,

    MinComboIDs.MinComboID ,

    DENSE_RANK() OVER ( ORDER BY MinComboIDs.MinComboID ) AS OutputComboID

    FROM MinComboIDs

    ),

    Matches

    AS ( SELECT gci.ID ,

    gci.Process ,

    gci.Combo ,

    gci.MinComboID ,

    gci.OutputComboID ,

    ( SELECT COUNT(DISTINCT gci2.ID)

    FROM GotComboIDs gci2

    WHERE gci2.OutputComboID = gci.OutputComboID

    ) AS NoOfMatch

    FROM GotComboIDs gci

    )

    SELECT DISTINCT

    Process ,

    Matches.OutputComboID ,

    Matches.NoOfMatch

    FROM Matches

    ORDER BY Matches.OutputComboID;

    Thomas Rushton
    blog: https://thelonedba.wordpress.com

  • It's not clear how these results derive from your sample data: specifically, it's not clear why there are two rows for ER instead of 1, and it's not clear where the 3 came from for CombID in the WR row.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply