Need help with simple ETL strategy with SSIS

  • I have the following ETL scenario and I need some suggestions on how to go about it. I have a source table Tbl_S and a target table Tbl_T. There is a table Tbl_ETL that has two columns - Col_S and Co_T, ie source and target respectively. This table tells me which source column is "associated" with which target column, that is Col_S contains the names of all columns of Tbl_S and Col_T contains the names of all columns of Tbl_T.

    Example -

    Tbl_S -

    Col1S,Col2S,Col3S...etc.

    Some rows here. Dont know what the data in those rows

    represents. eg.Col1S has 11092011. Is that date or

    some ID ? don't know.

    Tbl_T

    ID,Date,Date_Updated...etc

    no data here !

    So, ETL table is like -

    Tbl_S, Tbl_T

    Col1s, Date

    Col2s, Date_Updated

    Col3s, ID

    ...etc

    Now the problem is that tomorrow, the "associations" could change ! Col1S could mean ID instead of Date. YES ! this is a very weird situation and I have to to live with it. Also, the datatypes of source and target columns are not the same, ie not all varchar or all date.

    Two questions -

    1 - Is the problem clear or have i missed something ?

    2 - Any strategy on how this could be done using only SSIS ?

  • I don't use SSIS, so no help there. I just thought I'd mention that this would be incredibly easy and simple to do with a little Dynamic SQL.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • In contrast with Jeff, I know (quite) a lot about SSIS.

    DO NOT USE SSIS FOR THIS.

    Use dynamic SQL for this (as Jeff mentioned).

    If I would need to do this in SSIS, I would do it with dynamic SQL in an Execute SQL Task.

    (alternatively, you could also use the .NET SQLBulkCopy class in a Script Task)

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Koen Verbeeck (10/20/2013)


    In contrast with Jeff, I know (quite) a lot about SSIS.

    DO NOT USE SSIS FOR THIS.

    Use dynamic SQL for this (as Jeff mentioned).

    If I would need to do this in SSIS, I would do it with dynamic SQL in an Execute SQL Task.

    (alternatively, you could also use the .NET SQLBulkCopy class in a Script Task)

    Yes, thats what I am doing now. Dynamic SQL inside SSIS. So time consuming !

  • blasto_max (10/20/2013)


    Koen Verbeeck (10/20/2013)


    In contrast with Jeff, I know (quite) a lot about SSIS.

    DO NOT USE SSIS FOR THIS.

    Use dynamic SQL for this (as Jeff mentioned).

    If I would need to do this in SSIS, I would do it with dynamic SQL in an Execute SQL Task.

    (alternatively, you could also use the .NET SQLBulkCopy class in a Script Task)

    Yes, thats what I am doing now. Dynamic SQL inside SSIS. So time consuming !

    "Time consuming" how?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • You can do this in SSIS (I am about to emark on a DW build that will use this principle!)

    You can use the unpivot task (it is a little confusing at first but there are plenty of good examples if you google it)

    Some things to consider

    You need a common anchor (e.g. source record ID)

    You can only unpivot columns of the same type

    With strings (DT_STR) they all need to be the same length to unpivot them - Grrrrrr!

    So;

    Change the output lengths of the strings at the top of the stream

    Multicast the stream; one for each data type

    Generate an unpivot for each data type that you need.

    ->Passthrough the anchor values only

    ->unpivot the columns into with a target column of FieldName

    ->Leave the unpivot value as the incoming field name

    Union the streams: you will have to put each datatype into a different column in the unioned stream

    This will change the upstream dataset

    ID,strString,intValue,dteDate

    1,Bob,123,01/07/2013

    2,Jim,999,31/08/2013

    into

    1,strString,Bob

    1,intValue,,123

    1,dteDate,,01/07/2013

    2,strString,Jim

    2,intValue,,999

    2,dteDate,,,31/08/2013

    You can now use lookup to translate the inbound column names into outbound column names and then use the unpivot function to rebuild the data stream.

    I'm not saying it will be quick mind....

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply