Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 12»»

Ways to address incremental updates in SSIS Expand / Collapse
Author
Message
Posted Thursday, August 23, 2012 2:26 PM


Right there with Babe

Right there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with Babe

Group: General Forum Members
Last Login: Thursday, August 14, 2014 8:21 AM
Points: 753, Visits: 172
Hi fellows,

I'm trying to deploy a datamart, I have the star schema designed and I am working on the ETLs, but the issue is that there is one table that holds around 4.5 million rows and increasing, this table takes about 5 minutes to move from the OLTP system to staging... since the time isn't bad at all and we are using a test server for this, I would like to improve the times, but I haven't got to a good way to make this happen.

I tried using the Lookup data flow transformation, but it start taking 8 minutes and needs to load not just the OLTP's 4.5 million rows, but the ones actually on staging... which increased the time... but gave me just the new rows.

I am using an ADO NET Data source and Destination, I know that the Data source has a way that I can put a SQL Command but lack the knowledge to make that query receive a parameter.

Any help is welcome!!!

Thanks in advance.


Frank.
Post #1349349
Posted Friday, August 24, 2012 12:00 AM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Today @ 12:59 AM
Points: 13,632, Visits: 10,515
What are you doing exactly in the Lookup component? Why does it have to load staging as well?
A lookup component should be able to give you the inserts and the updates.




How to post forum questions.
Need an answer? No, you need a question.
What’s the deal with Excel & SSIS?

Member of LinkedIn. My blog at LessThanDot.

MCSA SQL Server 2012 - MCSE Business Intelligence
Post #1349475
Posted Monday, August 27, 2012 10:04 AM


Right there with Babe

Right there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with Babe

Group: General Forum Members
Last Login: Thursday, August 14, 2014 8:21 AM
Points: 753, Visits: 172
Thanks for your reply.

What I am trying to do is to load into staging only the new data from a table, I would load it all, but I am not sure about the 5 minutes it takes to do it.

How can I use the lookup component to get the inserts without loading staging?

Regards,

Franklin


Frank.
Post #1350466
Posted Monday, August 27, 2012 11:38 PM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Today @ 12:59 AM
Points: 13,632, Visits: 10,515
Well, you can't find the inserts without loading the destination table, as the lookup needs something to compare the data to.

If your source and your staging are SQL Server tables (2008 or up), you can try the MERGE statement. They would have the be on the same server to be practical though.

How did you configure the lookup component? Did you use a query?




How to post forum questions.
Need an answer? No, you need a question.
What’s the deal with Excel & SSIS?

Member of LinkedIn. My blog at LessThanDot.

MCSA SQL Server 2012 - MCSE Business Intelligence
Post #1350692
Posted Tuesday, August 28, 2012 2:17 PM


Right there with Babe

Right there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with Babe

Group: General Forum Members
Last Login: Thursday, August 14, 2014 8:21 AM
Points: 753, Visits: 172
I don´t use the query, just set it to "Use a table or a view" and pick the one from staging that is a copy of it.

Is there way I can use a SQL Query and send over a parameter to it? because this table works like a log, there will be no updates, just inserts, and I have a date column I can use, so if I can send the last date to the query as a parameter, I would be able to get only the new records :)

Thanks.




Frank.
Post #1351248
Posted Tuesday, August 28, 2012 2:41 PM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Today @ 12:59 AM
Points: 13,632, Visits: 10,515
f_ernestog (8/28/2012)
I don´t use the query, just set it to "Use a table or a view" and pick the one from staging that is a copy of it.

Is there way I can use a SQL Query and send over a parameter to it? because this table works like a log, there will be no updates, just inserts, and I have a date column I can use, so if I can send the last date to the query as a parameter, I would be able to get only the new records :)

Thanks.




First of all, use a query where you select only the column you need to perform the lookup. Otherwise you will be selecting every single bit of data of the lookup table into memory, which can be costly.

Do you want to use the parameterized query in the OLE DB Source, or in the lookup component?




How to post forum questions.
Need an answer? No, you need a question.
What’s the deal with Excel & SSIS?

Member of LinkedIn. My blog at LessThanDot.

MCSA SQL Server 2012 - MCSE Business Intelligence
Post #1351264
Posted Tuesday, August 28, 2012 2:49 PM


Right there with Babe

Right there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with Babe

Group: General Forum Members
Last Login: Thursday, August 14, 2014 8:21 AM
Points: 753, Visits: 172
In the OLE Component.

Frank.
Post #1351268
Posted Tuesday, August 28, 2012 2:56 PM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Today @ 12:59 AM
Points: 13,632, Visits: 10,515
You can use parameters there - example - but this can be a bit buggy. The easiest way, in my opinion, is to use a string variable and construct your query dynamically. The downside is that the query can only be 4000 characters long.

A good article on the subject: SSIS Design Pattern - Dynamic SQL

Or you could use a inline table valued function as your OLE DB Source query.




How to post forum questions.
Need an answer? No, you need a question.
What’s the deal with Excel & SSIS?

Member of LinkedIn. My blog at LessThanDot.

MCSA SQL Server 2012 - MCSE Business Intelligence
Post #1351273
Posted Tuesday, August 28, 2012 3:16 PM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Saturday, August 30, 2014 7:18 PM
Points: 6,250, Visits: 7,409
You need a way to identify the new records on the OLTP system so you can limit your 'pull' from that system.

Typical solutions:
1) A LateUpdateDatetime or a Timestamp which you reference from the warehouse for the last 'pull' you made, and use that as a lowerbound for the query that gets records from the OLTP system.

2) CHECKSUMs, which are basically contatonations of the rows coming from the source system simplified (and very slightly error prone). These are checked against the target system (which hard-stores the checksums) and anyone who's different is 'pulled' from the source system. This keeps data-load lighter, since you're not pulling entire rows into memory just to be checked.

3) Staging tables. Mass dump all the data into a staging table at the warehouse and MERGE it into the target table, taking care of all updates and inserts in a single pass. This is the worst case scenario, and requires massive amounts of data streaming. If you're on 2k5 it's worse, you need to double-pass the data with an update pass then an insert pass.

None of these directly address deletions, either.



- Craig Farrell

Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

For better assistance in answering your questions | Forum Netiquette
For index/tuning help, follow these directions. |Tally Tables

Twitter: @AnyWayDBA
Post #1351291
Posted Tuesday, August 28, 2012 3:44 PM


Right there with Babe

Right there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with Babe

Group: General Forum Members
Last Login: Thursday, August 14, 2014 8:21 AM
Points: 753, Visits: 172
Thank you Koen and Craig for your reply.

I will keep in mind Craig's advice, but based on the fact I have a deadline, I will go with passing the variable to the query on the OLE source.

I have already tried that out, the first time is a pain in the behind since it has to load the whole table, but then only the records which date is newer to the max date.

Thanks both of you for your help.

Regards.


Frank.
Post #1351308
« Prev Topic | Next Topic »

Add to briefcase 12»»

Permissions Expand / Collapse