Blog Post

Inserting Records AND Getting The Identity in SSIS - Part 2

,

Since I posted a ways back on retrieving the identity value for inserts in SQL Server Integration Services, I've learned many things - one of which is an inline solution for that problem I didn't know back then.  Check out that previous post for the business problem, then come back here for an "inline" solution using SSIS - without scripts - just regular SSIS tasks and components.
It's a Love/Hate Relationship
The OLE DB Command component in SSIS is a near-universally hated object due to it's inherent performance problems.  But it's so hard to hate when it can actually get the job done when nobody else can.
Yes, our old nemesis is back to solve a problem for us... at the same price it's always asked.  Do you want your packages to run fast, but be complicated?  Or do you want them to run slow, but be easy to understand?  Sadly, there is not much middle ground.
The Keys
It turns out to be pretty easy, and I should have got it into my thick skull before, from all the hints I'd read.  A stored procedure with an OUTPUT parameter, and the OLE DB Command are the keys.  For this example, assume we're using a table that has only two columns: ID and TEXTVALUE, with ID being the automatically generated identity column.
The Stored Procedure
Unfortunately, this doesn't use much of the GUI power of SSIS, it's old hand-coding.  For our simple table, this would be the stored procedure definition:

CREATE PROCEDURE InsertData
  
@textvalue AS CHAR(10),
  
@id AS INT OUTPUTAS
INSERT INTO
IdentityTable
  
(textvalue)
  
VALUES
  
(@textvalue)
SET
@id = SCOPE_IDENTITY()

The SSIS Data Flow
Use a Derived Column to generate the ID column with a NULL(DT_I4) value in it.  (We need somewhere to place our identity value in.)

Following that, we can work on the OLE DB Command component, using a statement like this:

EXEC InsertData ?, ? OUTPUT

Map the columns as you'd expect to map them.
If you'd like I've got an SSIS 2008 package showing this technique.
The Big Picture

Sure, it works - but here are the drawbacks:

It's going to perform badly. Horribly if the package is run on a different server from the database the stored procedure is on.  I can't stress that enough.  The OLE DB Command component is making a round-trip call to the database for each row.  Even on the same machine, that's an expensive operation.
You're doing an RBAR (row by agonizing row) process which is much, much slower than a set-based process that SQL Server is more suited for.  (SSIS is fine with RBAR, it's designed that way.)  Neither SQL Server process nor the Integration Services runtime will be working very hard during this part of the data flow - I don't think even one core would get utilized fully due to the wait-states involved.
If you're interested, here's a sample package to show this process.  It creates a database on your local machine, creates the sample table, generates one row, performs the insert (and shows the result to you), then tears the database down.
Can we mitigate this RBAR flaw?  Stick around or subscribe your RSS reader to find out in an upcoming post.

Original post (opens in new tab)

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating