SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


SSIS : Insert/update source to destination table


SSIS : Insert/update source to destination table

Author
Message
Brandon Carl Goodman
Brandon Carl Goodman
Ten Centuries
Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)

Group: General Forum Members
Points: 1144 Visits: 2731
I agree with ashokdasari, the MERGE is what I would do in this particular situation. I actually use the MERGE to handle TYPE II SCD in my data warehouse scenario. Very quick and powerful.
Koen Verbeeck
Koen Verbeeck
SSC Guru
SSC Guru (62K reputation)SSC Guru (62K reputation)SSC Guru (62K reputation)SSC Guru (62K reputation)SSC Guru (62K reputation)SSC Guru (62K reputation)SSC Guru (62K reputation)SSC Guru (62K reputation)

Group: General Forum Members
Points: 62386 Visits: 13298
Brandon Carl Goodman (4/23/2010)
I agree with ashokdasari, the MERGE is what I would do in this particular situation. I actually use the MERGE to handle TYPE II SCD in my data warehouse scenario. Very quick and powerful.


Unfortunately not everyone has SQL Server 2008 or later Sad
Those people, including me, are stuck with creating the SCD ourselfves.

@ lbouaziz: you can achieve your proposed solution more easily with the SCD wizard. (Even better, use the SCD Kimball wizard from codeplex.) The only problem with your solution is that every update row gets issued against the database in a seperate query. Not really smart performance wise. What if you have to update one million records? One million different transactions against your database?

(and for the critics: yeah yeah, one million updates against a dimension is very unlikely. But what if it is an initial load and for some reason you have to run the package again? All updates...)


How to post forum questions.
Need an answer? No, you need a question.
What’s the deal with Excel & SSIS?
My blog at SQLKover.

MCSE Business Intelligence - Microsoft Data Platform MVP
Brandon Carl Goodman
Brandon Carl Goodman
Ten Centuries
Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)

Group: General Forum Members
Points: 1144 Visits: 2731
I apologize for my oversight da-zero. I agree with you a 100% with the functionality of components for the SCD. How about using a checksum or even hashbyte value to determine a record condition? Similar to that of an incremental load.
Koen Verbeeck
Koen Verbeeck
SSC Guru
SSC Guru (62K reputation)SSC Guru (62K reputation)SSC Guru (62K reputation)SSC Guru (62K reputation)SSC Guru (62K reputation)SSC Guru (62K reputation)SSC Guru (62K reputation)SSC Guru (62K reputation)

Group: General Forum Members
Points: 62386 Visits: 13298
Brandon Carl Goodman (4/23/2010)
I apologize for my oversight da-zero. I agree with you a 100% with the functionality of components for the SCD. How about using a checksum or even hashbyte value to determine a record condition? Similar to that of an incremental load.


No need for apologies :-)

Checksum or hashbytes? Hmm, it depends :-)
It depends on what you are trying to do. If you only want to update the field that has changed, than a checksum is useless because you won't know which column has changed. If you update an entire row, then why not? But it comes with the overhead of calculating the checksums everytime.

What I usually do is a left outer join between staging table and DWH table on the business key. If the surrogate key of the DWH is null, than it is an insert, otherwise it is an update. I write the inserts to an OLE DB Destination with fast load and the updates to a temp table. Then I do a massive single update against the DWH table. Since you can use the surrogate key to find the records to update, it goes really fast. (especially if you have an index on that surrogate key). This set-up works quite well for me :-)
However, in SQL 2008 it is better and simpler to use the MERGE statement, as you already indicated.


How to post forum questions.
Need an answer? No, you need a question.
What’s the deal with Excel & SSIS?
My blog at SQLKover.

MCSE Business Intelligence - Microsoft Data Platform MVP
Brandon Carl Goodman
Brandon Carl Goodman
Ten Centuries
Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)

Group: General Forum Members
Points: 1144 Visits: 2731
Very nice. I have had to implement the check sum before. What I did was build a checksum value over the values of columns (in this case all of them) and performed and insert, update, delete based upon that value. I like your setup in your last post. However, I believe that we may be getting slightly off base of helping a fellow with his issue.:-) Pleasure discussing strategy with you.
lbouaziz
lbouaziz
Grasshopper
Grasshopper (24 reputation)Grasshopper (24 reputation)Grasshopper (24 reputation)Grasshopper (24 reputation)Grasshopper (24 reputation)Grasshopper (24 reputation)Grasshopper (24 reputation)Grasshopper (24 reputation)

Group: General Forum Members
Points: 24 Visits: 20
i'm agree for the SCD wizard, but i thought u wanted something different.

i have a real pleasure to read all that post on this subject Smile

I didn't use the codeplex one, and the merge option as well, and actually for me, dimension should not contains millions rows Smile

but i accept the fact that sometime, we don't have choices.

Hehe
Koen Verbeeck
Koen Verbeeck
SSC Guru
SSC Guru (62K reputation)SSC Guru (62K reputation)SSC Guru (62K reputation)SSC Guru (62K reputation)SSC Guru (62K reputation)SSC Guru (62K reputation)SSC Guru (62K reputation)SSC Guru (62K reputation)

Group: General Forum Members
Points: 62386 Visits: 13298
Brandon Carl Goodman (4/23/2010)
Very nice. I have had to implement the check sum before. What I did was build a checksum value over the values of columns (in this case all of them) and performed and insert, update, delete based upon that value. I like your setup in your last post. However, I believe that we may be getting slightly off base of helping a fellow with his issue.:-) Pleasure discussing strategy with you.


It's my pleasure as well :-)

Off topic, off topic... The tile is SSIS: Insert/Update soruce to destination.
Exactly what we're discussing :-D

How did you implement the checksum, I've never actually done it before.


How to post forum questions.
Need an answer? No, you need a question.
What’s the deal with Excel & SSIS?
My blog at SQLKover.

MCSE Business Intelligence - Microsoft Data Platform MVP
Brandon Carl Goodman
Brandon Carl Goodman
Ten Centuries
Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)

Group: General Forum Members
Points: 1144 Visits: 2731
Here is a link for Phil Brammer article on using the checksum component for ssis.

http://www.ssistalk.com/2007/03/09/ssis-using-a-checksum-to-determine-if-a-row-has-changed/

When I implemented the checksum, I did all of it from T-SQL. Allow me to explain -

1. I have a base and a staging table identical in schema. Data from my import is loaded into my staging table. On the staging table I have added an additional column (checksum value) that is a computed column contain a checksum over all of the columns of the staging table, with the exception of a few columns. As data is loaded into staging table, checksum value is computed on the fly.

2. Once data has been loaded into the staging table, I perform a comparison between the staging table and base table.

a. Delete - delete from base table where business key not exist in staging table.

b. Update - create a checksum value over the columns of base table and compare that checksum value and business key against the checksum value and the business key of the staging table. If not equal update, else ignore.

c. Insert - where business key not exist in base table.

If you like I could create a mockup of the syntax for you if that would make more sense. I am more of a visual person my self so sometimes this helps. A picture is worth a thousand words.
Koen Verbeeck
Koen Verbeeck
SSC Guru
SSC Guru (62K reputation)SSC Guru (62K reputation)SSC Guru (62K reputation)SSC Guru (62K reputation)SSC Guru (62K reputation)SSC Guru (62K reputation)SSC Guru (62K reputation)SSC Guru (62K reputation)

Group: General Forum Members
Points: 62386 Visits: 13298
Thanks for the link, very informative. However, at my current project I'm not allowed to install custom components, so yeah :-)

Correct me if I'm wrong, but to me it seems that the set-up that you outlined just makes a perfect copy of the staging table in the base table. (rows that are not in the staging table are deleted, rows that are not in the base table are inserted, changed rows are updated. Hence, you end up with the staging table).

Isn't it simpler to just truncate the base table, perform a load of the staging table and re-build indexes?
(or did I miss something. That's possible, it's friday :-))


How to post forum questions.
Need an answer? No, you need a question.
What’s the deal with Excel & SSIS?
My blog at SQLKover.

MCSE Business Intelligence - Microsoft Data Platform MVP
Brandon Carl Goodman
Brandon Carl Goodman
Ten Centuries
Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)

Group: General Forum Members
Points: 1144 Visits: 2731
Business requirements state that I am not allowed to truncate the base table as it needs to available at all times. I will work on a mockup for you and post it.
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search