Blog Post

SSIS Merge with Duplicate Rows

,

The Merge component in SSIS will take two sorted sources and union them while maintaining the original sort order. The question arises, what about duplicate records. These duplicates do not get eliminated.

Here are two tables with the ID 5 and the name Brian repeated on each.

imageimage
 
Here is the layout of the data flow. Both sources are using the same query with the different table names.

Select ID,Name
From Merge1
Order by ID

Select ID,Name
From Merge2
Order by ID

image
 
 
Just having the sources sorted does not tell the Merge they are sorted. You have to adjust the Sort property of the sources.

To do this right click on the sources and open the advanced editor. Click on the  Input/Output Properties tab and click on the OLEDB Source output. Then change the IsSorted property to True on the output. Click the plus next to the OLEDB Source output and select the ID column and change the SortKeyPosition to 1.
 
 image

image
 
 image

Then connect them both to the Merge.

I used a terminator destination from Task Factory, this is an add on from Pragmaticworks.com.
 
 image
 
I added a data viewer between the merge and the terminator destination by righting clicking on the green data flow line and selecting data viewer.
 

image
 
Execute the package and look at the data viewer. Notice the column with the ID of 5 is repeated. The Merge does not remove rows.

image

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating