Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase ««123»»

SSIS : Insert/update source to destination table Expand / Collapse
Author
Message
Posted Friday, April 23, 2010 6:24 AM
Right there with Babe

Right there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with Babe

Group: General Forum Members
Last Login: Today @ 6:37 PM
Points: 795, Visits: 2,450
I agree with ashokdasari, the MERGE is what I would do in this particular situation. I actually use the MERGE to handle TYPE II SCD in my data warehouse scenario. Very quick and powerful.
Post #909377
Posted Friday, April 23, 2010 6:35 AM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Yesterday @ 8:15 AM
Points: 13,017, Visits: 10,800
Brandon Carl Goodman (4/23/2010)
I agree with ashokdasari, the MERGE is what I would do in this particular situation. I actually use the MERGE to handle TYPE II SCD in my data warehouse scenario. Very quick and powerful.


Unfortunately not everyone has SQL Server 2008 or later
Those people, including me, are stuck with creating the SCD ourselfves.

@ lbouaziz: you can achieve your proposed solution more easily with the SCD wizard. (Even better, use the SCD Kimball wizard from codeplex.) The only problem with your solution is that every update row gets issued against the database in a seperate query. Not really smart performance wise. What if you have to update one million records? One million different transactions against your database?

(and for the critics: yeah yeah, one million updates against a dimension is very unlikely. But what if it is an initial load and for some reason you have to run the package again? All updates...)




How to post forum questions.
Need an answer? No, you need a question.
What’s the deal with Excel & SSIS?

Member of LinkedIn. My blog at LessThanDot.

MCSA SQL Server 2012 - MCSE Business Intelligence
Post #909390
Posted Friday, April 23, 2010 6:42 AM
Right there with Babe

Right there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with Babe

Group: General Forum Members
Last Login: Today @ 6:37 PM
Points: 795, Visits: 2,450
I apologize for my oversight da-zero. I agree with you a 100% with the functionality of components for the SCD. How about using a checksum or even hashbyte value to determine a record condition? Similar to that of an incremental load.
Post #909396
Posted Friday, April 23, 2010 6:55 AM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Yesterday @ 8:15 AM
Points: 13,017, Visits: 10,800
Brandon Carl Goodman (4/23/2010)
I apologize for my oversight da-zero. I agree with you a 100% with the functionality of components for the SCD. How about using a checksum or even hashbyte value to determine a record condition? Similar to that of an incremental load.


No need for apologies

Checksum or hashbytes? Hmm, it depends
It depends on what you are trying to do. If you only want to update the field that has changed, than a checksum is useless because you won't know which column has changed. If you update an entire row, then why not? But it comes with the overhead of calculating the checksums everytime.

What I usually do is a left outer join between staging table and DWH table on the business key. If the surrogate key of the DWH is null, than it is an insert, otherwise it is an update. I write the inserts to an OLE DB Destination with fast load and the updates to a temp table. Then I do a massive single update against the DWH table. Since you can use the surrogate key to find the records to update, it goes really fast. (especially if you have an index on that surrogate key). This set-up works quite well for me
However, in SQL 2008 it is better and simpler to use the MERGE statement, as you already indicated.




How to post forum questions.
Need an answer? No, you need a question.
What’s the deal with Excel & SSIS?

Member of LinkedIn. My blog at LessThanDot.

MCSA SQL Server 2012 - MCSE Business Intelligence
Post #909405
Posted Friday, April 23, 2010 7:06 AM
Right there with Babe

Right there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with Babe

Group: General Forum Members
Last Login: Today @ 6:37 PM
Points: 795, Visits: 2,450
Very nice. I have had to implement the check sum before. What I did was build a checksum value over the values of columns (in this case all of them) and performed and insert, update, delete based upon that value. I like your setup in your last post. However, I believe that we may be getting slightly off base of helping a fellow with his issue. Pleasure discussing strategy with you.
Post #909414
Posted Friday, April 23, 2010 7:15 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Friday, September 9, 2011 2:30 AM
Points: 4, Visits: 20
i'm agree for the SCD wizard, but i thought u wanted something different.

i have a real pleasure to read all that post on this subject :)

I didn't use the codeplex one, and the merge option as well, and actually for me, dimension should not contains millions rows :)

but i accept the fact that sometime, we don't have choices.


Post #909425
Posted Friday, April 23, 2010 7:20 AM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Yesterday @ 8:15 AM
Points: 13,017, Visits: 10,800
Brandon Carl Goodman (4/23/2010)
Very nice. I have had to implement the check sum before. What I did was build a checksum value over the values of columns (in this case all of them) and performed and insert, update, delete based upon that value. I like your setup in your last post. However, I believe that we may be getting slightly off base of helping a fellow with his issue. Pleasure discussing strategy with you.


It's my pleasure as well

Off topic, off topic... The tile is SSIS: Insert/Update soruce to destination.
Exactly what we're discussing

How did you implement the checksum, I've never actually done it before.




How to post forum questions.
Need an answer? No, you need a question.
What’s the deal with Excel & SSIS?

Member of LinkedIn. My blog at LessThanDot.

MCSA SQL Server 2012 - MCSE Business Intelligence
Post #909431
Posted Friday, April 23, 2010 7:35 AM
Right there with Babe

Right there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with Babe

Group: General Forum Members
Last Login: Today @ 6:37 PM
Points: 795, Visits: 2,450
Here is a link for Phil Brammer article on using the checksum component for ssis.

http://www.ssistalk.com/2007/03/09/ssis-using-a-checksum-to-determine-if-a-row-has-changed/

When I implemented the checksum, I did all of it from T-SQL. Allow me to explain -

1. I have a base and a staging table identical in schema. Data from my import is loaded into my staging table. On the staging table I have added an additional column (checksum value) that is a computed column contain a checksum over all of the columns of the staging table, with the exception of a few columns. As data is loaded into staging table, checksum value is computed on the fly.

2. Once data has been loaded into the staging table, I perform a comparison between the staging table and base table.

a. Delete - delete from base table where business key not exist in staging table.

b. Update - create a checksum value over the columns of base table and compare that checksum value and business key against the checksum value and the business key of the staging table. If not equal update, else ignore.

c. Insert - where business key not exist in base table.

If you like I could create a mockup of the syntax for you if that would make more sense. I am more of a visual person my self so sometimes this helps. A picture is worth a thousand words.
Post #909450
Posted Friday, April 23, 2010 7:44 AM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Yesterday @ 8:15 AM
Points: 13,017, Visits: 10,800
Thanks for the link, very informative. However, at my current project I'm not allowed to install custom components, so yeah

Correct me if I'm wrong, but to me it seems that the set-up that you outlined just makes a perfect copy of the staging table in the base table. (rows that are not in the staging table are deleted, rows that are not in the base table are inserted, changed rows are updated. Hence, you end up with the staging table).

Isn't it simpler to just truncate the base table, perform a load of the staging table and re-build indexes?
(or did I miss something. That's possible, it's friday )




How to post forum questions.
Need an answer? No, you need a question.
What’s the deal with Excel & SSIS?

Member of LinkedIn. My blog at LessThanDot.

MCSA SQL Server 2012 - MCSE Business Intelligence
Post #909469
Posted Friday, April 23, 2010 7:48 AM
Right there with Babe

Right there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with Babe

Group: General Forum Members
Last Login: Today @ 6:37 PM
Points: 795, Visits: 2,450
Business requirements state that I am not allowed to truncate the base table as it needs to available at all times. I will work on a mockup for you and post it.
Post #909477
« Prev Topic | Next Topic »

Add to briefcase ««123»»

Permissions Expand / Collapse