• Koen Verbeeck (11/26/2014)


    riyaziq (11/25/2014)


    As Lowell pointed out, you are inserting duplicate primary key values into the table which is not permitted.

    You need to add a 'Sort' component in your Data Flow task. Right-click the Sort component, choose Edit, and check all the fields that form your primary key. Be sure to check the "Remove duplicates" option at the bottom. For example, if your primary key is "EmployeeID", select only "EmployeeID" and check the option to remove duplicates. If your PK is a combination of two or more fields, such as "FirstName" and "LastName", check both fields along with the option to remove duplicates.

    Good luck!

    Remember that the Sort component is a fully blocking component: it reads all of the data into memory before producing even a single row of output.

    This is a no-go for large data sets.

    Koen Verbeeck,

    I just implemented a Data Flow task with a Sort component on a large dataset and, as you pointed out, it is slow. Could you tell me the ideal way of dealing with this situation? I am thinking of this approach: remove the primary key constraint so that the table accepts duplicate values, load the data, and then remove the duplicates with a separate Execute SQL Task:

    ALTER TABLE [Table Name] ADD AUTOID INT IDENTITY(1,1);

    DELETE FROM [Table Name]
    WHERE AUTOID NOT IN (SELECT MIN(AUTOID)
                         FROM [Table Name]
                         GROUP BY [Key Columns]);
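    As a point of comparison, here is one set-based alternative that avoids adding the extra IDENTITY column. This is only a sketch, not something from this thread: it uses ROW_NUMBER() to keep one row per key, and [Table Name] and the FirstName/LastName key columns are placeholders you would replace with your own table and intended primary key columns.

    ```sql
    -- Delete every row except the first one per key, using a CTE.
    -- Deleting through the CTE removes the underlying base-table rows.
    WITH Ranked AS
    (
        SELECT ROW_NUMBER() OVER (
                   PARTITION BY FirstName, LastName  -- the intended PK columns
                   ORDER BY FirstName, LastName      -- tie-break among duplicates is arbitrary
               ) AS rn
        FROM [Table Name]
    )
    DELETE FROM Ranked
    WHERE rn > 1;
    ```

    Either way, the de-duplication happens in the database engine after the load, so the Data Flow itself stays non-blocking.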