Derived Columns - Proper Use

  • Hi Group,

    I've built a package that among other things, takes a field called "JType" and maps each of its possible 103 values into 7 categories. I did this by first doing a conditional split off of the values into separate flows for each of the 7 categories. In the Conditional Split Transform, the expression for EACH category contained OR logic for anywhere between 4 and 10 string values. From there, on each of the 7 separate output data flows (per category), I tacked on a Derived Column Transform deriving, assigning, and adding a col for the Category Name (7 different values). I then Union All'd them back together into one stream.

    It works, which for someone less than 3 mos into the BI Stack is a very important priority. Beyond that, it looks pretty cool in the SSIS Designer, too. What I'm wondering is was this the right approach from a logical, efficiency, and performance standpoint? Was it the "right" way to go about it?

    TIA for any responses.

  • This sounds like a reasonable approach.

    The way I normaly handle things like this , would be to create a look-up table with all the JType values add a column with the category name.

    Then do a look-up transformation (or a join on the datasource query) to match the category name to the Jtype , then add the categoryname as a new column on the data flow.

  • Ok, I can see that. Sounds like from your first remark, I don't need to be embarrassed about the way I did it, but the Lookup transform definitely seems like a cleaner, more maintainable solution. Thanks for the improvement.

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply