Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 

Tim Mitchell

Tim Mitchell is a business intelligence consultant, author, trainer, and SQL Server MVP with over a decade of experience. Tim is the principal of Tyleris Data Solutions and is a Linchpin People teammate. Tim has spoken at international, regional, and local venues including the SQL PASS Summit, SQLBits, SQL Connections, SQL Saturday events, and various user groups and webcasts. He is a board member at the North Texas SQL Server User Group in the Dallas area. Tim is coauthor of the book SSIS Design Patterns, and is a contributing author on MVP Deep Dives 2. You can visit his website and blog at TimMitchell.net or follow him on Twitter at twitter.com/Tim_Mitchell.

Updating Existing Data with SSIS

I see a lot of questions on the forums about updating existing data in SSIS.  When the update is dynamic and is based on elements within your data flow, a common mistake is to use the Ole DB Command within the data flow pane:

figure1 

The above poorly designed update has us retrieving data from some source, and then running an update on our target database based on one or more values in the source.  Does it work?  Sure.  The problem occurs when you touch more than a handful of rows.  When you bring in the Ole Db command into the picture in a data flow, you’ll be firing the statement in this control once for every row of data in your pipeline.  So let’s say our source retrieves 100,000 rows into the data flow: the downstream UPDATE command will be executed 100,000 times!  Such operations could bring the most capable server to its knees.

A better solution could include staging your data.  Using this method, you’ll retrieve the data from the source and write it into a staging table in the destination database.  You can then use an Execute SQL task to run your update in a more organic manner.

figure2

Set up your data flow as shown above to pull in the data to a staging table, then you can run a single UPDATE statement:

UPDATE i
SET InvoiceAmount = st.UpdatedInvoiceAmount
FROM Invoices i
INNER JOIN StagedData st ON i.InvoiceID = st.InvoiceID

The advantage here is that you’re executing the expensive UPDATE statement once for each table rather than once for each row affected.

Of course, there are some situations that explicitly disallow the use of staging tables in destination systems.  If storage or access restrictions keep you from using this method, you may have to use the row-by-row insert, so be aware that it's going to be a bottleneck.

Comments

Posted by Anonymous on 10 September 2009

Pingback from  Dew Drop – September 10, 2009 | Alvin Ashcraft's Morning Dew

Posted by yogesh_t_r on 19 July 2012

We use a Temp table instead of a dedicated staging table. Its Brief and to the point thx.

Leave a Comment

Please register or log in to leave a comment.