Is this type of join possible with SSIS?

  • Hi!

    I'm trying to get a BI project into SSIS and I'm stucked.

    We have a table with all the People ID's, call them ID_PPL

    Then I have several transactions, and they are separated by a code (1, 2, 3) but a people can have 3 codes, 2 codes or one, or nothing. Then, each people can make a lot of transactions into one code and all that info is processed and grouped by them unique ID.

    I.E:

    My ID table is something like this (very simplified)

    ID - NAME

    1111 - John

    1112 - Paul

    1113 - Eve

    1114 - Peter

    1115 - Bill

    TABLE COD3

    ID - NAME - CODE - TRX

    1111 - John - Cod 3 - $1000

    1111 - John - Cod 3 - $500

    1112 - Paul - Cod 3 - $300

    TABLE COD2

    1111 - John - Cod 2 - $100

    TABLE COD1

    1113 - Eve - Cod 1 - $5000

    Then i process into this (lets sum the amounts of each code, and a total) so the resultset would be like this:

    ID - NAME - TOTALSUM - TOTALCOD1 - TOTALCOD2 - TOTALCOD3

    1111 - John - $1600 - NULL - $100 - $1500

    1112 - Paul - $300 - NULL - NULL - $300

    1113 - Eve - $5000 - $5000 - NULL - NULL

    1114 - Peter - NULL - NULL - NULL- NULL

    1115 - Bill - NULL - NULL - NULL - NULL

    That's the way, because i need to join every table, and keep every ID even if it's NULL

    So i tried a cascade and joined, losing all the people that ID(PK) doesn't match with other

    Then i tried with multicast from ID_PPL but if there is a person who don't have any SUM it can't join, and well, my boss hate me (i'm a trainee student with a lot of responsability now)

    I would appreciate if someone could show me a light, really.

    And thanks in advance (and excuse about my english, I'm from Chile)

  • If joins are the issue, you could try using a Merge Join as this supports Inner, left outer and fullouter joins.

    Steve.

  • mmm

  • stevefromOZ (2/12/2010)


    If joins are the issue, you could try using a Merge Join as this supports Inner, left outer and fullouter joins.

    Hi Steve! first of all thanks for your reply

    All of this -Joins- were builded with the Merge Join box so I don't know what can i do :smooooth:

  • Very roughly this is what I would try to do....

    First source is to read from the ID_PPL table, so stream of incoming data is simply

    ID Name

    1 Steve

    etc

    (it's arbitrary but) next source is Table CODE_3. Read in the fields

    ID, TRX

    and push them into an aggregation task (alternatively, simply aggregate in your SQL that sources the data)

    Now, with the ID_PPL being the 'left' stream of data, connect it and the output from the CODE_3 table in a merge join. Set it to LEFT OUTER join, so that *all* of the people come through, regardless of whether they have Code_3 sales or not.

    The output columns are ID, Name and Code_3_ttl.

    No read the Code_2, use the same theory, left outer join (where the prior output of the merge join is the 'left' and this new stream is the right), join on ID again.

    New output columns from this join will be ID, Name, Code_3_ttl, Code_2_ttl.

    Use the same pattern for joining in Code_1.

    The queries for each data source will be like...

    ID_PPL source --> Select ID, Name FROM ID_PPL

    Code_3 --> Select ID, SUM(TRX) AS code_3_ttl FROM code_3_table

    Code_2 --> Select ID, SUM(TRX) as code_2_ttl from code_2 _table

    and so on.

    HTH,

    Steve.

  • Hey!!

    Thanks! it worked fine

    I used all my ID data as left input using Left Outer Join, It doesn't lose any row! 😛

  • Or you can use the following SQL code in your OLE DB Source:

    SELECT

    t0.[ID] AS [ID]

    , [Name]

    , (isnull(TOTALCOD1,0) + isnull(TOTALCOD2,0) + isnull(TOTALCOD3,0)) AS TOTALSUM

    , TOTALCOD1

    , TOTALCOD2

    , TOTALCOD3

    FROM

    (SELECT [ID]

    ,[Name]

    FROM [ID_PPL]) t0

    LEFT OUTER JOIN

    (SELECT [ID]

    , SUM([TRX]) AS TOTALCOD1

    FROM [COD1]

    GROUP BY [ID]) t1

    ON t0.[ID] = t1.[ID]

    LEFT OUTER JOIN

    (SELECT [ID]

    , SUM([TRX]) AS TOTALCOD2

    FROM [COD2]

    GROUP BY [ID]) t2

    ON t0.[ID] = t2.[ID]

    LEFT OUTER JOIN

    (SELECT [ID]

    , SUM([TRX]) AS TOTALCOD3

    FROM [COD3]

    GROUP BY [ID]) t3

    ON t0.[ID] = t3.[ID]

    This is much more efficient, as you don't need multiple inputs and multiple merge joins. Don't forget your data has to be sorted in order to use merge join, which is a blocking transformation if you use the Sort component!

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply