SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Ways to address incremental updates in SSIS


Ways to address incremental updates in SSIS

Author
Message
f_ernestog
f_ernestog
SSC Eights!
SSC Eights! (807 reputation)SSC Eights! (807 reputation)SSC Eights! (807 reputation)SSC Eights! (807 reputation)SSC Eights! (807 reputation)SSC Eights! (807 reputation)SSC Eights! (807 reputation)SSC Eights! (807 reputation)

Group: General Forum Members
Points: 807 Visits: 186
Hi fellows,

I'm trying to deploy a datamart, I have the star schema designed and I am working on the ETLs, but the issue is that there is one table that holds around 4.5 million rows and increasing, this table takes about 5 minutes to move from the OLTP system to staging... since the time isn't bad at all and we are using a test server for this, I would like to improve the times, but I haven't got to a good way to make this happen.

I tried using the Lookup data flow transformation, but it start taking 8 minutes and needs to load not just the OLTP's 4.5 million rows, but the ones actually on staging... which increased the time... but gave me just the new rows.

I am using an ADO NET Data source and Destination, I know that the Data source has a way that I can put a SQL Command but lack the knowledge to make that query receive a parameter.

Any help is welcome!!!

Thanks in advance.

Frank.
Koen Verbeeck
Koen Verbeeck
SSC Guru
SSC Guru (59K reputation)SSC Guru (59K reputation)SSC Guru (59K reputation)SSC Guru (59K reputation)SSC Guru (59K reputation)SSC Guru (59K reputation)SSC Guru (59K reputation)SSC Guru (59K reputation)

Group: General Forum Members
Points: 59951 Visits: 13297
What are you doing exactly in the Lookup component? Why does it have to load staging as well?
A lookup component should be able to give you the inserts and the updates.


How to post forum questions.
Need an answer? No, you need a question.
What’s the deal with Excel & SSIS?
My blog at SQLKover.

MCSE Business Intelligence - Microsoft Data Platform MVP
f_ernestog
f_ernestog
SSC Eights!
SSC Eights! (807 reputation)SSC Eights! (807 reputation)SSC Eights! (807 reputation)SSC Eights! (807 reputation)SSC Eights! (807 reputation)SSC Eights! (807 reputation)SSC Eights! (807 reputation)SSC Eights! (807 reputation)

Group: General Forum Members
Points: 807 Visits: 186
Thanks for your reply.

What I am trying to do is to load into staging only the new data from a table, I would load it all, but I am not sure about the 5 minutes it takes to do it.

How can I use the lookup component to get the inserts without loading staging?

Regards,

Franklin

Frank.
Koen Verbeeck
Koen Verbeeck
SSC Guru
SSC Guru (59K reputation)SSC Guru (59K reputation)SSC Guru (59K reputation)SSC Guru (59K reputation)SSC Guru (59K reputation)SSC Guru (59K reputation)SSC Guru (59K reputation)SSC Guru (59K reputation)

Group: General Forum Members
Points: 59951 Visits: 13297
Well, you can't find the inserts without loading the destination table, as the lookup needs something to compare the data to.

If your source and your staging are SQL Server tables (2008 or up), you can try the MERGE statement. They would have the be on the same server to be practical though.

How did you configure the lookup component? Did you use a query?


How to post forum questions.
Need an answer? No, you need a question.
What’s the deal with Excel & SSIS?
My blog at SQLKover.

MCSE Business Intelligence - Microsoft Data Platform MVP
f_ernestog
f_ernestog
SSC Eights!
SSC Eights! (807 reputation)SSC Eights! (807 reputation)SSC Eights! (807 reputation)SSC Eights! (807 reputation)SSC Eights! (807 reputation)SSC Eights! (807 reputation)SSC Eights! (807 reputation)SSC Eights! (807 reputation)

Group: General Forum Members
Points: 807 Visits: 186
I don´t use the query, just set it to "Use a table or a view" and pick the one from staging that is a copy of it.

Is there way I can use a SQL Query and send over a parameter to it? because this table works like a log, there will be no updates, just inserts, and I have a date column I can use, so if I can send the last date to the query as a parameter, I would be able to get only the new records Smile

Thanks.

Frank.
Koen Verbeeck
Koen Verbeeck
SSC Guru
SSC Guru (59K reputation)SSC Guru (59K reputation)SSC Guru (59K reputation)SSC Guru (59K reputation)SSC Guru (59K reputation)SSC Guru (59K reputation)SSC Guru (59K reputation)SSC Guru (59K reputation)

Group: General Forum Members
Points: 59951 Visits: 13297
f_ernestog (8/28/2012)
I don´t use the query, just set it to "Use a table or a view" and pick the one from staging that is a copy of it.

Is there way I can use a SQL Query and send over a parameter to it? because this table works like a log, there will be no updates, just inserts, and I have a date column I can use, so if I can send the last date to the query as a parameter, I would be able to get only the new records Smile

Thanks.




First of all, use a query where you select only the column you need to perform the lookup. Otherwise you will be selecting every single bit of data of the lookup table into memory, which can be costly.

Do you want to use the parameterized query in the OLE DB Source, or in the lookup component?


How to post forum questions.
Need an answer? No, you need a question.
What’s the deal with Excel & SSIS?
My blog at SQLKover.

MCSE Business Intelligence - Microsoft Data Platform MVP
f_ernestog
f_ernestog
SSC Eights!
SSC Eights! (807 reputation)SSC Eights! (807 reputation)SSC Eights! (807 reputation)SSC Eights! (807 reputation)SSC Eights! (807 reputation)SSC Eights! (807 reputation)SSC Eights! (807 reputation)SSC Eights! (807 reputation)

Group: General Forum Members
Points: 807 Visits: 186
In the OLE Component.

Frank.
Koen Verbeeck
Koen Verbeeck
SSC Guru
SSC Guru (59K reputation)SSC Guru (59K reputation)SSC Guru (59K reputation)SSC Guru (59K reputation)SSC Guru (59K reputation)SSC Guru (59K reputation)SSC Guru (59K reputation)SSC Guru (59K reputation)

Group: General Forum Members
Points: 59951 Visits: 13297
You can use parameters there - example - but this can be a bit buggy. The easiest way, in my opinion, is to use a string variable and construct your query dynamically. The downside is that the query can only be 4000 characters long.

A good article on the subject: SSIS Design Pattern - Dynamic SQL

Or you could use a inline table valued function as your OLE DB Source query.


How to post forum questions.
Need an answer? No, you need a question.
What’s the deal with Excel & SSIS?
My blog at SQLKover.

MCSE Business Intelligence - Microsoft Data Platform MVP
Evil Kraig F
Evil Kraig F
SSCoach
SSCoach (19K reputation)SSCoach (19K reputation)SSCoach (19K reputation)SSCoach (19K reputation)SSCoach (19K reputation)SSCoach (19K reputation)SSCoach (19K reputation)SSCoach (19K reputation)

Group: General Forum Members
Points: 19721 Visits: 7660
You need a way to identify the new records on the OLTP system so you can limit your 'pull' from that system.

Typical solutions:
1) A LateUpdateDatetime or a Timestamp which you reference from the warehouse for the last 'pull' you made, and use that as a lowerbound for the query that gets records from the OLTP system.

2) CHECKSUMs, which are basically contatonations of the rows coming from the source system simplified (and very slightly error prone). These are checked against the target system (which hard-stores the checksums) and anyone who's different is 'pulled' from the source system. This keeps data-load lighter, since you're not pulling entire rows into memory just to be checked.

3) Staging tables. Mass dump all the data into a staging table at the warehouse and MERGE it into the target table, taking care of all updates and inserts in a single pass. This is the worst case scenario, and requires massive amounts of data streaming. If you're on 2k5 it's worse, you need to double-pass the data with an update pass then an insert pass.

None of these directly address deletions, either.


- Craig Farrell

Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

For better assistance in answering your questions | Forum Netiquette
For index/tuning help, follow these directions. |Tally Tables

Twitter: @AnyWayDBA
f_ernestog
f_ernestog
SSC Eights!
SSC Eights! (807 reputation)SSC Eights! (807 reputation)SSC Eights! (807 reputation)SSC Eights! (807 reputation)SSC Eights! (807 reputation)SSC Eights! (807 reputation)SSC Eights! (807 reputation)SSC Eights! (807 reputation)

Group: General Forum Members
Points: 807 Visits: 186
Thank you Koen and Craig for your reply.

I will keep in mind Craig's advice, but based on the fact I have a deadline, I will go with passing the variable to the query on the OLE source.

I have already tried that out, the first time is a pain in the behind since it has to load the whole table, but then only the records which date is newer to the max date.

Thanks both of you for your help.

Regards.

Frank.
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search