Duplicate key issue when Insert/Select

  • I have an issue where I am getting an error on an unique index.

    I know why I am getting the error but not sure how to get around it.

    The query does a check on whether a unique value exists in the Insert/Select. If I run it one record at a time (SELECT TOP 1...) it works fine and just won't update it if the record exists.

    But if I do it in a batch, I get the error. I assume this is because it does the checking on the file before records are written out and then writes out the records one at a time from a temporary table.

    It thinks all the records are unique because it compares the records one at a time to the original table (where there would be no duplicates). But it doesn't check the records against each other. Then when it actually writes out the record, the duplicate is there.

    How do I do a batch where the Insert/Select would write out the records without the duplicates as it does when I do it one record at a time.

    CREATE TABLE #TestTable

    (

    Name varchar(50),

    Email varchar (40)

    )

    Insert #TestTable (Name,Email) Values('Tom', 'tom@aol.com')

    Insert #TestTable (Name,Email) Values('Larry', 'larry@aol.com')

    Insert #TestTable (Name,Email) Values('Mary', 'mary@aol.com')

    Insert #TestTable (Name,Email) Values('Mary2', 'mary@aol.com')

    CREATE TABLE #TestTable2

    (

    Name varchar(50),

    Email varchar (40)

    )

    CREATE UNIQUE INDEX IX_TestTable2

    ON #TestTable2(Email);

    GO

    Truncate table #TestTable2

    -- This one gets an Error:

    -- Cannot insert duplicate key row in object 'dbo.#TestTable2' with unique index 'IX_TestTable2'.

    Insert #TestTable2 (Name, Email)

    Select Name, Email

    FROM #TestTable

    WHERE Email not in (Select Email FROM #TestTable2)

    -- Running these gets no error and only 3 get written to the file (as expected).

    Truncate table #TestTable2

    Insert #TestTable2 (Name, Email)

    Select Top 1 Name, Email

    FROM #TestTable

    WHERE Email not in (Select Email FROM #TestTable2)

    Insert #TestTable2 (Name, Email)

    Select Top 1 Name, Email

    FROM #TestTable

    WHERE Email not in (Select Email FROM #TestTable2)

    Insert #TestTable2 (Name, Email)

    Select Top 1 Name, Email

    FROM #TestTable

    WHERE Email not in (Select Email FROM #TestTable2)

    Insert #TestTable2 (Name, Email)

    Select Top 1 Name, Email

    FROM #TestTable

    WHERE Email not in (Select Email FROM #TestTable2)

    SELECT * from #TestTable

    SELECT * FROM #TestTAble2

    Thanks,

    Tom

  • You have to filter out "duplicate values" in your data set before you try to insert them into the destination table.

    You can do this by assigning a row number using the ROW_NUMBER() function. You can assign row numbers inside a partition, which is in this case the Email column. Then you keep only rows where the row number is 1.

    SELECT Name, Email

    FROM

    (

    SELECT Name, Email, rid = ROW_NUMBER() OVER (PARTITION BY Email ORDER BY Name)

    FROM #TestTable

    ) tmp

    WHERE rid = 1;

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

Viewing 2 posts - 1 through 1 (of 1 total)

You must be logged in to reply to this topic. Login to reply