How can I ignore duplicates during an import using SSIS?

  • Every semester I need to import an Excel spreadsheet with a list of students, and their student ID which is unique.

    I wrote an SSIS package that reads in the Excel data, creates a couple derived columns, then imports the data into an SQL table. This works fine if there are no records in the SQL table, but crashes out if there are already students with that ID in the SQL table.

    I read something about doing a Lookup task first, then a Conditional Split, but could not find any examples of what I need to do.

    Anyone have an example, or know of a "how to" article on this subject? Seems like that feature should be built into the import wizard, but I know it's not.

    Thanks in advance!

  • The fastest way to accomplish this task is to create a lookup task and instead of passing on the rows that find a match you pass on the rows that don't find a match (error rows).

    In the Lookup Component click on the Configure Error Output button. In the dialogue set the Lookup Output Error column to Redirect Row. Then select the Red Output Arrow (this is the error output) from the Lookup Component and connect it to the destination. Now only rows that do not exist in the table will be inserted.

    For performance reasons I'd put the Lookup before the Derived Column transform(s).

    Attached are some screenshots.

  • Thanks! This will be a big help!!

Viewing 3 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply