Blog Post

SSIS Pivot on bad data

,

The pivot transform in SSIS is already a pain to use. When you have bad data it makes it even worse. In this blog I am going to show how to handle bad data when using the pivot transform. You should understand how to use the pivot transform already and Devin Knight has a great blog on how to do that here.

Here is the input table:

image

The output table should look like so:

image

This is a situation where the users have entered data and they have left off the types on the input table and therefore we do not know where the value should go. These values will be dropped in this example. When loading the output table we need to pivot the data. Another issue is the IDs of the incoming data are sequential and not matching. So IDs 1-5 are the first row, 6-10 are the second row and so on.

Here is the data flow used to perform all of this work.

image

All of these issues can be handled in SSIS with native task. We will use an aggregate transform in this example. Remember an aggregate transform is an Asynchronous transform and does not perform well if you have a lot of rows. This aggregation could be done with a staging table if that is the case.

Here is the query used to pull the information from the input table:

SELECT ID, isnull(Type,’X') as Type, [Value]

FROM dbo.PivotInput

This will give us the following table:

image

In the pivot transform you can create a column to catch all of the X columns. These are the rows missing the type.

image

After the pivot transform the data will look like the below image in a data viewer:

image

Here you can see that the data has been pivoted but the ID issue still needs to be resolved. You need to place ID 1-5 on the same row and 6-10 on the same row and make this work for all numbers. You will do this with a derived column and the aggregate transform.

The next transform is the derived column. Here you will create a new ID column with the following expression:

image

Now after the derived column the data will look like the below image:

image

Notice now you have a New ID that can be grouped together. The aggregate transform will do this.

Here is how the aggregate transform is set up:

image

Notice you are dropping the X column. You could do a multi cast before this to map those bad rows to another output like a flat file for someone to examine manually.

After all of this we map it to the output and the table looks like so:

Let me know if you have any weird situations like this. I always love a good challenge.

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating