Currently, none of the existing Data Flow sources allow you to source variables. Once again in my quest for performance, I look to keep as much data as possible in memory. There are many uses for pushing data to external files (Raw Files, control files, etc), but there is a difference between doing so, because you need to and doing so because you have no choice. In this post I will show you how to take data output from a stored procedure into local variables and then, using a Script Task Source, build an entire Data Flow Component.
First up is getting the data. I decided to use a stored procedure, since I can get a single record. In this post I will use a set of variables, which correspond to a single record. The stored procedure is created with output parameters.
The output parameters are mapped to SSIS variables:
We have the result of a stored procedure in memory, through the use of variables. Now I need a way to use them in a Data Flow and to do so means I need to either introduce them through a Derived Column or a Data Source; I chose Data Source. To do this I need to create a custom Script Component.
The first thing I do is create the output columns, which is how the columns gathered within the source component are introduced into the data stream. I create a corresponding output for each of the variables filled in by the stored procedure.
Now we have to introduce the variables to into the script component and there are two ways to do this. The first way is to list them in the read-only section of the script component.
I didn't do that this time. I was carrying over a bit of prejudice from SSIS 2005 where I found it easier to work with variables through the VariableDispenser object. I identify the variables I want to insert into the data stream and then pass them down, using the CreateNewOutputRows() method available to a Script Source Component.
Each package variable is added to the corresponding output variable built off the Script Source Component for use by any component within the Data Flow Task.
Now you might be thinking, "That's a neat trick, but why would I want to do that. I want SSIS to pull as many records as possible and push them down then line I want speed!" So do I and for the package I built, this is the way to get it. This is an enhancement to the document loader I built. Pulling hundreds of .pdfs from a directory slowed down workflow considerably. It was more efficient to work on a single document at a time. Now you might be asking 'Why don't I just select from the queue table to get the next document to import?" The answer is: You can't. Stored procedures can't be called from within a Data Flow task and since I want to remove the record from the queue, I needed a stored procedure.
This an enhancement to my Enhanced Threading Framework which I designed. You can find the older slideshow here. I am in the process of enhancing it, including a way to turn engines on and off, and improved performance.