Blog Post

Simple Series : SSIS Conditional Split

,

Conditional Split is a very powerful component, enabling you to send records in different directions or exclude them from the downstream altogether.  I use this component quite often in my SSIS packages.  In the industry I work in, publishing, there are many times we'll get a file in from an aggregation company or a large distributor of our books (like Amazon).  Many times the ISBN is poorly formatted and this might be for any number of reasons.  A good example of this is we generally get this aggregated information via excel sheet (yes, because humans love excel) and if you don't format the sheet right, an ISBN10, which starts '00', will have the leading zeros truncated.  We might have an isbn with the dashes in place, much like you see on the Barnes&Noble site.  We have processes to scrub this data, but in the end they might have bad data, so I use a conditional split to extract this data out and push it to an error section, be it an error template we mail back to a person and/or a local table that we store this information in.  I have found at times embedded control totals.  

We love control totals as it gives us numbers to compare our process to.  If we add up a bunch of numbers and our numbers match the control totals given, we know we are doing things the same way the aggregator is.  This is wonderful, except we do not want to control totals in the item level table, that is what will will sum to match control totals.  The control totals go into a different table.  This is where a gotcha with Conditional Split comes in.

Have you ever noticed the numbers on the left hand side and the up and down arrows on the right of the component.  Neither did I until I got bit.  Conditional Split component compares each condition in that order, top to bottom, first to last.   This means that if you have a very restrictive conditional after a less restrictive condition, you run the risk of the first condition (less restrictive) meeting the criteria of the data you wanted for the more restrictive.  This means you data will go down the wrong path.  Let me explain again, with pictures:

 

img 1

It turns out that in the excel source, all of the columns except C103_NetAmount are null.  That means the InvalidISBN will match the columns that are destined for GrandTotal.  The GrandTotal condition is much more restrictive than the BadISBN condition.  The simple way to fix this is to select GrandTotal and then click the up_arrow.  This will move the GrandTotal condition up to number 1.  The result is below:

img2

Remember to put your conditions in top to bottom order; Most restrictive to least restrictive.  I also tend to put the fewest number of checks within the condition.  I have nearly a dozen columns in the table and I could have listed the value of each one as it pertains to the condition, but why?  I select a column or two that meet the condition and use it.  In this case there was an overlap, so I simply rearranged the order.  This is much simpler than have a 10+ condition.  Very hard to read and potentially hard to maintain.

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating