SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Incremental Load using SQL SERVER 2008


Incremental Load using SQL SERVER 2008

Author
Message
rahulsahay123
rahulsahay123
SSC-Enthusiastic
SSC-Enthusiastic (129 reputation)SSC-Enthusiastic (129 reputation)SSC-Enthusiastic (129 reputation)SSC-Enthusiastic (129 reputation)SSC-Enthusiastic (129 reputation)SSC-Enthusiastic (129 reputation)SSC-Enthusiastic (129 reputation)SSC-Enthusiastic (129 reputation)

Group: General Forum Members
Points: 129 Visits: 153
I have to perform incremental load using ssis. The workflow and architecture is as follows:

I have an etl process in place which will pull data from Server_A (oracle application) and dump the data to Server_B ( also called staging server SQL SERVER 2008). Once data is dumped, then business logic is applied and the using the etl process transformed data is dumped to the target tables(SQL SERVER). ETL process runs to create the csv's of the target tables and is send to the 3rd party who maintains the datawarehouse using the csv's.
Now I have to pull the delta only from Server_A to Server_B and have to maintain the history also. I have checked the Server_A tables. there are no modified columns or flag columns on which I can rely and calculate the delta. I cannot use SCD on Server_A neither I can use CDC.
I tried finding the delta from Server_A by using the checksum equivalent in oracle. But the problem is that I am not able to create the history.
Other approach is to dump all the tables to staging and then apply checksum. but this will hit the performance and moreover client doesnot approves this architecture.
I have in all around 70 tables in Server_A which I transfers to Server_B when ever I am refereshing the data.

Can you please help me in building the logic for calculating the delta?

Thanks
Rahul Sahay
Email: rahulsahay123@gmail.com
Daniel Bowlin
Daniel Bowlin
SSCrazy Eights
SSCrazy Eights (8.1K reputation)SSCrazy Eights (8.1K reputation)SSCrazy Eights (8.1K reputation)SSCrazy Eights (8.1K reputation)SSCrazy Eights (8.1K reputation)SSCrazy Eights (8.1K reputation)SSCrazy Eights (8.1K reputation)SSCrazy Eights (8.1K reputation)

Group: General Forum Members
Points: 8144 Visits: 2629
It sounds as though your hands are tied and you don't really have good options. The only thing you haven't ruled out is to do a comparison of the data as you load it from Oracle to your staging server and don't write the data that has not changed to your staging target. I would use the hashbytes function rather than a checksum because it is less likely to have a colision of data. Also make sure you trim both your source and target when comparing the data.
rahulsahay123
rahulsahay123
SSC-Enthusiastic
SSC-Enthusiastic (129 reputation)SSC-Enthusiastic (129 reputation)SSC-Enthusiastic (129 reputation)SSC-Enthusiastic (129 reputation)SSC-Enthusiastic (129 reputation)SSC-Enthusiastic (129 reputation)SSC-Enthusiastic (129 reputation)SSC-Enthusiastic (129 reputation)

Group: General Forum Members
Points: 129 Visits: 153
But when I am passing only delta to the staging server then I am not able to build the history. Because checksum in sqlserver and hashfunction in oracle cannot be compared. Henced I have to dump everything onto staging and calculate the delta.
Now client is not approving this approach.
They wants delta to come from server A. But problem with this approach is history building. Example table1 in server A has values 1, 2, 3. And table2 has value 1, 2. Henced by the end of pass 1 delta is 3. Now when I am running the package for 2nd time then delta 3 should be appended to table2 otherwise we will always get cumulative delta.
Since server A is an oracle application ie legacy system, I have only select rights. I dont have access for creation or insertion.
aaron.reese
aaron.reese
SSCrazy
SSCrazy (2.5K reputation)SSCrazy (2.5K reputation)SSCrazy (2.5K reputation)SSCrazy (2.5K reputation)SSCrazy (2.5K reputation)SSCrazy (2.5K reputation)SSCrazy (2.5K reputation)SSCrazy (2.5K reputation)

Group: General Forum Members
Points: 2475 Visits: 907
If there is no field on the table to identify the last time a record was changed, how do they expect you to define the delta.

Your options:

Full snapshot of the table for yesterday and today and compare records based on primary key - slow and takes a lot of space if the tables are large.

Set a trigger on the source table to write to a changed record log table and use this as the source for the delta extract

If you know that records are only modified within (say) 5 days of their creation date you could pull all records that were created in the last 5 days and treat that as a delta and let the 3rd party handle the SCD issue.
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search