best method to handle SCD

  • Hi All,

    We are using SCD types in dimension loading. For this we have planned to use default SCD component in SSIS. We also found diffrent methods to handle SCD like SCD wizard, Look/condtional split, T-sql Merge statement, Dimension Merge SCD component (from codeplex)

    http://dimensionmergescd.codeplex.com/

    Please suggest which method would be the best for handling SCD.

    Thanks in advance..

    Regards,

    Gugan

  • gugan_ta (11/4/2011)


    Hi All,

    We are using SCD types in dimension loading. For this we have planned to use default SCD component in SSIS. We also found diffrent methods to handle SCD like SCD wizard, Look/condtional split, T-sql Merge statement, Dimension Merge SCD component (from codeplex)

    http://dimensionmergescd.codeplex.com/

    Please suggest which method would be the best for handling SCD.

    I've never heard good things about the performance of the SCD component and have never used it other than for demonstrations; I'm not sure how much it is used in real-life production.

    Whether you use a Lookup/Conditional split or a T-SQL merge probably depends upon volume/performance. New records performance is fine as you just write them out to your OLE destination component. Updating existing records using an OLE DB Command is slow (since it's row by row), but is plenty fine if the volume is low. If the number of records that need updating is larger, writing to a staging table and then doing your INSERTS & UPDATES from a T-SQL command using MERGE will perform better.

    I can't comment on the component from codeplex as I've never used it.

    As with everything, test/validate which scenario performs better on your data & systems.

    HTH,

    Rob

  • gugan_ta (11/4/2011)


    Hi All,

    We are using SCD types in dimension loading. For this we have planned to use default SCD component in SSIS. We also found diffrent methods to handle SCD like SCD wizard, Look/condtional split, T-sql Merge statement, Dimension Merge SCD component (from codeplex)

    http://dimensionmergescd.codeplex.com/

    Please suggest which method would be the best for handling SCD.

    Thanks in advance..

    Regards,

    Gugan

    What TYPE of SCD are you trying to use?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • SCD 2 Type

  • best way to handle type 2 scd in my opinion, is to create a staging table that is identical to your load table and use the t-sql merge.

    this will be much quicker than using a component as it is set based, you can also use the output clause to expire the old rows.

  • For us, it was important that any change made by any method had to be detected and that the appropriate rows where enddated as part of the transaction that did the inserts. I know a lot of people scoff at the use of triggers but they have a "built in" OUTPUT clause in the form of the INSERTED tables formed. The triggers were, of course, written in a set-based fashion and were nasty fast.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply