UPSERT USING SSIS 2005

  • While building the datawarehouse solution for my company, I have frequently come across the situation where I have to update and Insert data from our ODS to the Datawarehouse. Even though we can achieve this via Slowly Changing Dimensions Task or Custom Scripts, attached is a simple tutorial that outlines the procedures that can also be followed to implement this

    Have fun

  • Why not submit this to SSC as an article? The forums are usually used for questions.

  • Forums are fine for whatever, but I second the idea of submitting it as an article.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • For what it's worth, I third the motion to see this in an article. Have you done any time/performance comparisons on this method versus the SCD transformation?

    John Rowan

    ======================================================
    ======================================================
    Forum Etiquette: How to post data/code on a forum to get the best help[/url] - by Jeff Moden

  • Yep! Fourth here! Add a bit of a write up to go with all the great pictures and submit that bad boy as an article. I'm sure that it'll become one of those articles that all the big dogs direct folks to. Nicely done.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • How are you handling when more than one column's value is updated?

    John Rowan

    ======================================================
    ======================================================
    Forum Etiquette: How to post data/code on a forum to get the best help[/url] - by Jeff Moden

  • John Rowan (8/5/2009)


    How are you handling when more than one column's value is updated?

    I am interested in this as well...I import files where potentially any one (or many) of 40 + columns may have changed....I have no prior knowledge of which column(s).

    many thanks Graham

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • Hey there Ritz... I sure do hope you're still around because I need some help that your very good Word document just isn't helping me with. Would you take a look at the following post of mine and see if you can lend a hand? I sure would appreciate it. Thanks

    http://www.sqlservercentral.com/Forums/Topic858273-148-1.aspx

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • For those on SSIS 2008: you might consider using the MERGE statement offering you less work and better performance.

    Basically, your package will look like this:

    Data Flow Task: Read Data From Source and write it to staging table.

    Execute SQL Task: MERGE INTO final table USING staging table.

    As a side note: beware of Columns that can be NULL when comparing them in the Conditional Split!

    Regards,

    Willem
    http://wschampheleer.wordpress.com[/url]

  • I am using Slowly changing dimension wizard to load my staging database on a daily basis...the prb is while im running my packg with large tables(like 100 thousand records) it takes lot of time ...is there any way to speed up pakg...plz help....

  • This is a known problem with the SCD transform.

    Your alternatives are:

    - use the Kimball Method SSIS Slowly Changing Dimension Component[/url]

    - instead of doing the updates via OLE DB CMD, use an OLE DB Destination to write to an intermediate table and do the update using an Execute SQL Task in the Control Flow (set based instead of RBAR, well explained in this book).

    - do not use a wizard but use lookups and conditional splits (also explained in the above book)

    - when using SSIS 2008: use the merge statement (my preferred approach) - here is a nice (but unfortunately rather incomplete) article to get you started

    Regards,

    Willem
    http://wschampheleer.wordpress.com[/url]

  • I have to ask this question:

    In poorly designed tables that have 400+ columns (but luckily do have a unique guid as a PK), wouldn't it just be easier to check to see if the uid exists in the table, if it does DELETE it, if not INSERT it?

    Rather than having to perform and update on various columns (i.e. 1 col - 400+ cols)???

    ______________________________________________________________________________Never argue with an idiot; Theyll drag you down to their level and beat you with experience

  • Instead of using the conditional split to explicitly determining the columns that have been changed, you can use a SQL Command with an UPDATE statement to the entire table

    i.e.

    UPDATE TableName

    SET

    Column1 = ?,

    Column2 = ?,

    Column3 = ?,

    Column4 = ?,

    Column5 = ?

    ...etc

    WHERE ID = ?

    And complete the mapping

    This will take care of any updates to the columns

  • MyDoggieJessie (4/11/2011)


    I have to ask this question:

    In poorly designed tables that have 400+ columns (but luckily do have a unique guid as a PK), wouldn't it just be easier to check to see if the uid exists in the table, if it does DELETE it, if not INSERT it?

    Rather than having to perform and update on various columns (i.e. 1 col - 400+ cols)???

    I didn't know that was an option. My answer would be "Yes, most definitely".

    Just be aware that ALL DELETE and INSERT triggers will fire and ALL indexes will be updated twice.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply