Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 

The Voice of the DBA

Steve Jones is the editor of SQLServerCentral.com and visits a wide variety of data related topics in his daily editorial. Steve has spent years working as a DBA and general purpose Windows administrator, primarily working with SQL Server since it was ported from Sybase in 1990. You can follow Steve on Twitter at twitter.com/way0utwest

Scripting in SSIS

Tim Mitchell presented at SQLSaturday #17 on Scripting in SSIS. I thought it was a really interesting presentation, and I’m glad I went. I haven’t done much SSIS work, and very little with scripting, so I learned a few things right off.

Tim is a great contributor here at SQLServerCentral, and I was happy to support him. He made a nice intermediate presentation on scripting in SSIS and I hope he writes a few articles.

The presentation started with a little background on how the scripting works in SSIS. Tim notes that VB.NET only could be used in 2005, but C# support was in SQL Server 2008. He also said that there were two scripting items in SSIS. The Sscript task in the data flow is where his presentation focused.

If the only tool you have is a hammer, everything looks like a nail. He used that analogy and said if you can do something with another task, use that. Don’t need to use scripting for everything,. Scripting works well when you can’t do something with native tools, or when it is very cumbersome.

This is lightweight programming, but it is programming. There can be performance issues as well since you’re essentially writing software. This also can be good job security if you learn to do it well. He mentioned a demand for SSIS jobs, which I’ve seen as well.

As far as debugging goes, there are no good end user feedback from the tools. This can be challenging if you are not a developer.

The Script task in the control flow typically is used for anything except moving data. It can, but usually used to work with OS, filesystem, external programs, variables, etc..

Script component is for data flow, used for data manipulation. Can be useful for branching, thought the presentation didn’t focus on that.

Tim works a lot with hospitals,and patient data. The script component works well dealing with the various unconventional inputs that he sees like:

  • semi structured
  • nonlinear files
  • multiple lines of text
  • varying numbers of columns
  • dissimilar data types
  • record type formats.

I haven’t done a lot of SSIS work, so it’s good to see the different real world types of issues people deal with. It might come in handy in the future if I run into one of these situations.

Tim had a nice record import example with characters from “The Office”, using characters and situations. Jim with a pencil stuck in his hand, Meredith’s car accident. and Dwight’s overdose of beets. That made me smile seeing it.

Then he used a script to output to three tables, 3 outputs, from one input. A script component allows n outputs. The outputs are set to the name and datatype for each output table or destination.

The edit script can be intimidating, but it shows real programming code. The CreateNewOutputRows() method that reads the file and loops through the stream. Each row is read in the loop and if it matches a particular value, it is added to the buffer for a particular output. This needs to be done before the values are added to the various elements of the output.

There was no error handling in the script, which I hate to see, but I understand trying to keep things simple. I would prefer to see this included and then skip over/mention the fact that it is in there. It’s easy to comment on TRY..CATCH blocks, but show a good use of the error handling structure.

The second example was the reverse of the first. Take data from multiple sources and combine it into one record. The interesting thing here is that the input needs to be un-synchronized from the input. Not all rows are going to be merged. The third example is to get multiple lines per input. For example, a report type item that describes a row in the db across multiple rows in the report. This was an interesting use of the Script component and basic programming to handle this structure. The fourth example reversed the third.

I thought this was a slightly dry session, and I have sent some specific notes to Tim for future work. Not that I’m a great presenter, but I noticed a few things that could be better. However as a basic introduction to the script component, I thought this was fantastic.

Comments

Posted by Tim Mitchell on 7 August 2009

Steve, thanks for sitting in on the session, and for the feedback as well.

Posted by Alvin Ramard on 7 August 2009

Scripting in the data flow certainly adds a lot of possibilities to what can do.  The only thing I have used it for so far is to remove non-printing and extended ascii characters from an input file.  The script makes use of search and replace using regular expressions.

Leave a Comment

Please register or log in to leave a comment.