Blog Post

Using the Error Output of the Derived Column

,

The other day I was building a simple data import package when I ran into a common situation; invalid conversions in a Derived Column.  The conversion was taking an ISBN13 which is usually defined as char(13) and converting it into a bigint.  All valid ISBN13s are valid bigint.  The flat file source I was importing had invalid ISBNs in it, the first set of which where simply blank.  I was about to modify my derived column to shunt all blank ISBNs to 0 when it turns out there is a second set of bad data; invalid ISBNs.  These are either ISBN10s, which contain a letter in the 10th position, or plain character garbage.  This changes the whole plan of attack.

The simple derived column was going to turn into a nightmare of nested if/then constructs to convert those invalid values into a 0.  To try and capture a sample of all invalid ISBNs, I modified the Data Flow as follows:

output_stage_1.png

The error output is on Failure, not truncation, because I want to see all invalid data and I know this is more than just numbers that are too large for the data type.  Then it hit me.  There right in front of me was the solution.  With a little tweaking:

output_stage_1.png

I know the Error Output is an invalid ISBN and I want that to be converted to a 0.  Open up the Invalid ISBN component and you simply see this:

output_invalid_isbn.png

There you go.  If you have the potential for numerous invalid values in a specific column, rather than build complex Derived Column formulas, build the one formula that is valid, shunt the invalid records to another data path and put your default values in.

Original post (opens in new tab)

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating