But I know that not everyone is comfortable coding a custom component from scratch - you have to be a code addict like me to find joy in that. So here's a little "gateway drug" for those of you who are itching for the capability. Perhaps a little taste of it will expose you to the script's shortcomings, and lead you to build your own lab to cook up a custom component to do this job properly.
Location, Location, Location
The big key here, as mentioned by Darren Green in an earlier post on the subject, is where to put code to do this. It's not in one of the automatically generated stubs - you'll have to tread into slightly less familiar territory - the ProcessInput method override. The standard stubs (PreExecute, PostExecute, and Input0_ProcessInputRow) are all useless for our purposes - you can safely delete them or ignore them. Why? PreExecute doesn't yet have any information about the buffer, and PostExecute is too late (obviously). Input0_ProcessInputRow is called once per row... but with a carefully wrapped row object that presents the columns in nicely named properties. There's no way to access the collection of columns. Only ProcessInput has access to the buffer in a condition that allows iterating over the columns.
Paste This Code
{
bool fireAgain = true;
ComponentMetaData.FireInformation(0, "",
Buffer.ColumnCount.ToString() + " columns",
"", 0, ref fireAgain);
while (Buffer.NextRow())
{
for (int columnIndex = 0;
columnIndex < Buffer.ColumnCount;
columnIndex++)
{
string columnData = null;
if (Buffer.IsNull(columnIndex))
{
columnData = "is NULL";
}
else
{
BufferColumn columnInfo = Buffer.GetColumnInfo(columnIndex);
switch (columnInfo.DataType)
{
case DataType.DT_BOOL:
columnData = Buffer.GetBoolean(columnIndex).ToString();
break;
case DataType.DT_WSTR:
columnData += Buffer.GetString(columnIndex);
break;
// add code to support more data types here
default:
columnData = "";
break;
}
}
ComponentMetaData.FireInformation(0, "",
"Column " + columnIndex.ToString() + ": " + columnData,
"", 0, ref fireAgain);
}
}
base.ProcessInput(InputID, Buffer);
}public override void ProcessInput(int InputID, PipelineBuffer Buffer)What's Happening In There?It doesn't take much explanation - but every little bit helps.
Initially, a loop has to be constructed to iterate over all the rows we're given. Do remember that this isn't ALL the rows coming through your Data Flow - it's just one buffer's worth. This method will get called several times (unless you have very few rows in your flow).
The column then gets checked to see if it's NULL... because NULLs cause quite a problem when you don't expect to see them.
After that, a little inquiry is made to ask for some information about the column. This bit of code can definitely be optimized out of the loop - it is a burden on the system to ask for it for each and every row!
The reason for getting a little information about the column is apparent in the next block - the switch statement that handles different datatypes differently. You can extend the code to handle the data types you expect.
Finally, after the loop, don't forget to call the base ProcessInput method. Why? Press F12 on that call and you'll see - the base code handles marking the processing as being complete when you've finished seeing all the rows.
Variations On This Theme
The above code isn't the only way to get this done. You can do away with manually detecting the column data type and simply call ToString on the buffer's indexer - as in:
string columnData = Buffer[columnIndex].ToString();But do still beware of nulls and other odd results. Using the "Get" methods specific to the data type do perform faster, and are safer in the long run.



Subscribe to this blog
Briefcase
Print
Posted by Steve Jones on 3 May 2011
Nice one. This is a good trick to keep around in my toolbox.
Posted by mudzana on 4 May 2011
Nice
Posted by vishal.gamji on 5 May 2011
Kewl.
Posted by Richard Campbell on 5 May 2011
I have been looking or a way to do this fo a long time. There is a wy to do this with Input0_PrcocessInputRow but can be broken if you make any changes to the input columns after the script has been created.
Posted by Richard Campbell on 5 May 2011
I forgot to ask, can this be done ib SSIS 2005 (in VB script)? I would expect that it would be possible.
Posted by Richard Campbell on 6 May 2011
Having succeeded in implementing the code in VB today without too much trouble, I have a question. I can use this code to get the column values but how do I get the column names?
I am trying to develop a generic script that will consume rows of data and submit them to a Web Service. To do this I need the column names as well as the values.
Any help would be much appreciated.
Posted by tmcdermid on 29 May 2011
@Richard - It won't be easy for you to get the column names. From the script... I'm not entirely sure it's possible, but it should be.
In the PreExecute method, you can iterate over the ComponentMetaData.InputCollection[0].InputColumnCollection collection. Each element in there will have a .Name. Once you've found the name(s) you're looking for, then you'll have to make a call to BufferManager.FindColumnByLineageId, and pass it the LineageID from the column you just found. Unfortunately, I can't find a way to get at the PipelineComponent instance that is the ScriptComponent. BufferManager is a property of PipelineComponent, but ScriptComponent doesn't expose it... even though it has to be an instance of PipelineComponent somewhere along a chain of inheritances.
If you're looking to do some dynamic stuff that relates to the column name, you probably have to code a custom component. It's not terribly hard. If you're the Richard Campbell based in Vancouver, drop by one of the VanPASS meetings with your code, and within a half hour, I'm sure I'll be able to take your Script code and turn it into a custom component...