SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 

Loading Non Duplicated Data Using SSIS

By Rahul Gupta,

Introduction

Loading non-duplicated from source is often an exercise needed in the data warehouse environment. In this article, we walk through an example of how all the rows that are duplicated can be removed by using Integration Services (SSIS). Below are the steps given to achieve the same.

Building a Package

For our example, let’s assume that following table (Stage Source 1) is loaded into our environment with data as shown below:

In this rows, with source id 1 are the duplicated rows. We will be developing a workflow using aggregator transformation to remove these duplicated rows as shown in the diagram below:

Configure the OLE DB source connection manager as shown below:

Configure the Aggregator transformation as shown below. Take the count of records and do a group by all the columns present in the table.

Now using the conditional split transformation let’s split the output into unique and duplicated records. The unique records are identified by the condition where record count is equal to one while the duplicated records are identified by the condition where record count is greater than one as shown in the figure below:

After running this step only the unique rows from Source id 2 to 5 (as these are unique records) will be loaded into the OLE DB destination . The duplicated rows will be ignored and not loaded into the OLE DB destination.

Summary

In this article,  I have shown how to load data which don’t have duplicated records in source into the target using SSIS. This is useful in situations where we cannot remove the duplicated  data in the source using sql.  In those scenarios, SSIS can be used to load non-duplicated data into the target.

 
Total article views: 1367 | Views in the last 30 days: 1
 
Related Articles
FORUM

Lookup fails when there is duplicate records in source

Lookup fails when there is duplicate records in source

FORUM

Delete Duplicate Records,

How can i delete duplicate records without changing their Identity values.

FORUM

Duplicate Records

duplicates

SCRIPT

Removing Duplicate Records

How to remove duplicate records from a table.

SCRIPT

Deleting Duplicate Records

In Datawarehousing, a normal scenario is eleminating duplicate records/rows or deleting duplicate re...

 
Contribute