SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


A SSIS Design Pattern for Change Data Capture


A SSIS Design Pattern for Change Data Capture

Author
Message
rarpal
rarpal
Forum Newbie
Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)

Group: General Forum Members
Points: 5 Visits: 240
Comments posted to this topic are about the item A SSIS Design Pattern for Change Data Capture
Koen Verbeeck
Koen Verbeeck
SSC Guru
SSC Guru (98K reputation)SSC Guru (98K reputation)SSC Guru (98K reputation)SSC Guru (98K reputation)SSC Guru (98K reputation)SSC Guru (98K reputation)SSC Guru (98K reputation)SSC Guru (98K reputation)

Group: General Forum Members
Points: 98706 Visits: 13320
A had a quick glance through the article. The title says it's about CDC, but I don't see anything about the change data capture functionality introduced in SQL Server 2008. Instead, you roll your own alternative.

In the first dataflow, you do a full outer join between the staging table and the destination table. Can't that introduce locking? Suppose you have a very large dimension and the source is still reading while you are already inserting new rows. Since the OLE DB Destination with Fast Load takes by default a table lock, this might introduce issues.

Your article starts with the following sentence:

Within Integration Services there is no single out of the box components to handle CDC


That is simply not true. There are build-in components available for CDC in SSIS 2012 (CDC Control Task, CDC Source and CDC Splitter). This article is published in june 2012, and SQL Server 2012 has been publicly available for some time now, so you can't just ignore that edition.

In my opinion, the solution this article proposes is very good for SQL Server 2005, but not needed anymore starting from SQL Server 2008 (unless you don't have Enterprise Edition of course).

edit: if Enterprise Edition is not availabe, I would rather consider the MERGE statement for SQL Server 2008 and up.


How to post forum questions.
Need an answer? No, you need a question.
What’s the deal with Excel & SSIS?
My blog at SQLKover.

MCSE Business Intelligence - Microsoft Data Platform MVP
Koen Verbeeck
Koen Verbeeck
SSC Guru
SSC Guru (98K reputation)SSC Guru (98K reputation)SSC Guru (98K reputation)SSC Guru (98K reputation)SSC Guru (98K reputation)SSC Guru (98K reputation)SSC Guru (98K reputation)SSC Guru (98K reputation)

Group: General Forum Members
Points: 98706 Visits: 13320
Another question:

you claim the following:

Efficient in handling large volumes of data


About how much volume are we speaking here?
It seems to me that for very large volumes the FULL OUTER JOIN can get problematic.


How to post forum questions.
Need an answer? No, you need a question.
What’s the deal with Excel & SSIS?
My blog at SQLKover.

MCSE Business Intelligence - Microsoft Data Platform MVP
clucasi
clucasi
Mr or Mrs. 500
Mr or Mrs. 500 (587 reputation)Mr or Mrs. 500 (587 reputation)Mr or Mrs. 500 (587 reputation)Mr or Mrs. 500 (587 reputation)Mr or Mrs. 500 (587 reputation)Mr or Mrs. 500 (587 reputation)Mr or Mrs. 500 (587 reputation)Mr or Mrs. 500 (587 reputation)

Group: General Forum Members
Points: 587 Visits: 490
I had to design a package very similar to the one demonstrated but I had the problem that the COU had to be done in sequence so unfortunately Rbar was implemented in a script task.
chris.j.clague
chris.j.clague
Forum Newbie
Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)

Group: General Forum Members
Points: 3 Visits: 142
Really, staging table and a merge statement is all you need (2008+). For 2005 I'd just use a proc to replicate the functionality of the merge statement. SSIS is massive overkill for this kind of function (and many other examples I see on a daily basis).
Koen Verbeeck
Koen Verbeeck
SSC Guru
SSC Guru (98K reputation)SSC Guru (98K reputation)SSC Guru (98K reputation)SSC Guru (98K reputation)SSC Guru (98K reputation)SSC Guru (98K reputation)SSC Guru (98K reputation)SSC Guru (98K reputation)

Group: General Forum Members
Points: 98706 Visits: 13320
chris.j.clague (6/13/2012)
Really, staging table and a merge statement is all you need (2008+). For 2005 I'd just use a proc to replicate the functionality of the merge statement. SSIS is massive overkill for this kind of function (and many other examples I see on a daily basis).


What is the overkill exactly between the sp and the SSIS package?


How to post forum questions.
Need an answer? No, you need a question.
What’s the deal with Excel & SSIS?
My blog at SQLKover.

MCSE Business Intelligence - Microsoft Data Platform MVP
epinchuk 79984
epinchuk 79984
Forum Newbie
Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)

Group: General Forum Members
Points: 3 Visits: 5
I don't understand how it could work with huge amount of data because the solution uses CHECKSUM function which is not reliable.
Example:
select binary_checksum('51;52;56;2204;') produces 1726190947
select binary_checksum('51;53;56;2205;') produces 1726190947

select binary_checksum('51;52;56;2205;') produces 1726190963
select binary_checksum('51;53;56;2204;') produces 1726190963

As you can see different input strings produces the same output.
Those cases are not so rare if if have to work with millions of records.
In general, we are using similar approach to load data to our DW but we had to write custom script component for comparing.
clementhuge
clementhuge
SSC Veteran
SSC Veteran (207 reputation)SSC Veteran (207 reputation)SSC Veteran (207 reputation)SSC Veteran (207 reputation)SSC Veteran (207 reputation)SSC Veteran (207 reputation)SSC Veteran (207 reputation)SSC Veteran (207 reputation)

Group: General Forum Members
Points: 207 Visits: 301
i agree with the comment above. merge statement is fabulous. you got your data instaging with a bulk load insert. then you can handle updates, deletes and inserts the way you want between staging and destination tables.
before use audit columns and tables to handle what to extract (i use cozyroc to handle dynamic meta data) and in 2k8, cdc was introduced.

Clement
dinnebieri
dinnebieri
SSC-Enthusiastic
SSC-Enthusiastic (100 reputation)SSC-Enthusiastic (100 reputation)SSC-Enthusiastic (100 reputation)SSC-Enthusiastic (100 reputation)SSC-Enthusiastic (100 reputation)SSC-Enthusiastic (100 reputation)SSC-Enthusiastic (100 reputation)SSC-Enthusiastic (100 reputation)

Group: General Forum Members
Points: 100 Visits: 48
Using the BINARY_CHECKSUM introduces a limitation to your solution. The BINARY_CHECKSUM function will only detect changes to columns of comparable types. This means that if you have columns declared as type text, ntext, image, cursor, xml, etc. then changes to those won't be detected.
rarpal
rarpal
Forum Newbie
Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)

Group: General Forum Members
Points: 5 Visits: 240
The solution described in this article was based on an application initially implemented on 2005 where there were no built in CDC components. The underlying database server was subsequently upgraded to 2008. Since this solution was build using only the standard components, it continued to work with the 2005 backward compatibility pack without any downtime.

You are absolutely right about the CDC features in 2008, however this solution ensured that the application continued to worked pre and post 2008 upgrade without interruption.
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search