Ignore Duplicate Keys

  • What's the best way to ignore attempts to insert dups into a primary key? I definitely want them rejected, BUT, I don't want the sproc that's inserting the records to terminate with an error, I just want it to keep going.

    So if I have a table with a primary key of HistTranId.

    and a stored procedure with an insert statement:

    create proc...

    insert into TranTable(HistTranId) VALUES(111)

    insert into TranTable(HistTranId) VALUES(222)

    insert into TranTable(HistTranId) VALUES(111)

    insert into TranTable(HistTranId) VALUES(444)

    I just want the sproc to ignore the dup 111 and keep going. Should I trap the error and ignore it, or is there an easier way?

    SQL2008

    I

    .

  • Hi

    Primary key is always unique, as soon as you try to insert duplicate data, you get an error.

    Options:

    • Create an INSTEAD OF trigger and filter data within the trigger
    • Don't insert duplicates 😉
    • Call a second proc within your first which inserts only one row. Call this proc and surround the INSERT statement with a TRY CATCH block

    Greets

    Flo

  • Use the IGNORE_DUP_KEY = ON option, see BOL.

    Regards,

    Willem
    http://wschampheleer.wordpress.com[/url]

  • Flo's second option (don't insert duplicates) is normally by far the best option of those presented so far.

    The best way to achieve that depends on factors which you have not yet shared with us (unless the procedure in question really is a series of single-row INSERTs with hard-coded values...?!)

    If you can give some more background to this, you'll probably get a better answer.

    Some questions:

    1. What is the original source of the data to be added (a data file/query results)?

    2. How much data is there?

    3. How often is the process run?

    4. Are the duplicates exclusively in the new data, or might a new row conflict with existing data?

  • IGNORE_DUP_KEY seems like just the ticket! Preventing the problem in the first place would be best, but I don't have complete control over that part. Occasional dups are slipping through and causing problems. I prefer to bury my head in the sand on this one, and ignore_dup_key is perfect! 😀

    .

  • There is an issue with the logic of this solution;

    You said you don't want to stop the error but want the process to keep running if it does error. The suggestion of Ignore_dup_Key will allow the duplicate records to be entered, meaning at some point you will still have to come back and clean up the data which, in the long run will be more work than the other recommended solutions.

    Link to my blog http://notyelf.com/

  • Really? According to BOL, the offending rows fail. Am I missing something? From BOL:

    "IGNORE_DUP_KEY = { ON | OFF }

    Specifies the error response to duplicate key values in a multiple-row insert operation on a unique clustered or unique nonclustered index. The default is OFF.

    ON

    A warning message is issued and only the rows violating the unique index fail.

    OFF

    An error message is issued and the entire INSERT transaction is rolled back."

    Thanks!

    .

  • IGNORE_DUP_KEY is an index option which means that you have to define it when you create your unique index or primary key.

    I would recommend leaving the default option of IGNORE_DUP_KEY = OFF and go with Flo's second option, don't insert duplicates.

    John Rowan

    ======================================================
    ======================================================
    Forum Etiquette: How to post data/code on a forum to get the best help[/url] - by Jeff Moden

  • IGNORE_DUP_KEY seems like a magic bullet doesn't it? Be aware though that nothing is for free. When this option is enabled on an index, SQL Server has to include extra logic in query plans to handle it. Frequently, you can end up with very poor plans for a query that INSERTs or UPDATEs data that affects that index. If you do decide to go with IGNORE_DUP_KEY (despite all the advice to the contrary), be sure to check all plans that affect the table in question, particularly if the query modifies data contained in the index.

    You really should do something better here.

  • Attached is a sample of what I'm trying to do.

    .

  • I assume:

    INSERT INTO @VendorExport

    SELECT * FROM @Transactions

    WHERE NOT EXISTS(SELECT TranId FROM @Transactions)

    should say:

    INSERT INTO @VendorExport

    SELECT * FROM @Transactions T

    WHERE NOT EXISTS(SELECT * FROM @VendorExport VE WHERE VE.TranId = T.TranId)

    ...?

  • In any case, here are a couple of alternatives:

    -- New Transactions

    CREATE TABLE #Transactions

    (

    TranIdBIGINT PRIMARY KEY,

    TranDateDATETIME,

    TranCodeVARCHAR(4),

    AmountMONEY

    );

    -- Existing Transactions

    CREATE TABLE #VendorExport

    (

    TranIdBIGINT PRIMARY KEY,

    TranDateDATETIME,

    TranCodeVARCHAR(4),

    AmountMONEY

    );

    -- Generate one million existing transactions (takes about three seconds)

    WITH Data (rn)

    AS (

    SELECT TOP (1000000)

    ROW_NUMBER() OVER (ORDER BY C1.object_id, C1.column_id)

    FROM master.sys.columns C1,

    master.sys.columns C2,

    master.sys.columns C3

    )

    INSERT #VendorExport(TranID, TranDate, TranCode, Amount)

    SELECT rn,

    DATEADD(HOUR, Data.rn, '19950101'),

    CASE rn % 2 WHEN 0 THEN 'room' WHEN 1 THEN 'tax' ELSE NULL END,

    RAND(CHECKSUM(NEWID())) * 25 + 1

    FROM Data;

    -- Generate new data, with some overlaps

    WITH Data (rn)

    AS (

    SELECT TOP (10000)

    ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

    FROM master.sys.columns C1,

    master.sys.columns C2

    )

    INSERT #Transactions (TranId, TranDate, TranCode, Amount)

    SELECT rn * 100,

    DATEADD(HOUR, Data.rn, '19950101'),

    CASE rn % 2 WHEN 0 THEN 'room' WHEN 1 THEN 'tax' ELSE NULL END,

    rn % 25

    FROM Data;

    -- Add records which don't already exist (method 1)

    MERGE #VendorExport E

    USING #Transactions T

    ON T.TranID = E.TranID

    WHEN NOT MATCHED THEN

    INSERT (TranID, TranDate, TranCode, Amount)

    VALUES (T.TranID, T.TranDate, T.TranCode, T.Amount);

    -- Add records which don't already exist (method 2)

    INSERT #VendorExport

    (TranID, TranDate, TranCode, Amount)

    SELECT T.TranId, T.TranDate, T.TranCode, T.Amount

    FROM #Transactions T

    WHERE NOT EXISTS

    (

    SELECT *

    FROM #VendorExport VE

    WHERE VE.TranId = T.TranId

    );

    -- Add records which don't already exist (method 3)

    INSERT #VendorExport

    (TranID, TranDate, TranCode, Amount)

    SELECT TranID, TranDate, TranCode, Amount

    FROM #Transactions

    EXCEPT

    SELECT TranID, TranDate, TranCode, Amount

    FROM #VendorExport

    -- Tidy up

    DROP TABLE

    #Transactions,

    #VendorExport;

  • Thanks Paul! I will definitely give these alternatives a shot and run some benchmarks. Like you said though, the IGNORE_DUP_KEY sure SEEMS like a silver bullet on the surface!

    MERGE and EXCEPT are two new ones on me. It will be good to add those to my bag of tricks.

    .

  • You are welcome. I particularly like the MERGE syntax - and it produces an arguably neater plan, avoiding a table spool. Do let us know how you get on.

  • if u have primary key on table then just the trap the error by

    @@ERROR

    and generate error with RAISERROR ('Duplicate Record')

Viewing 15 posts - 1 through 15 (of 21 total)

You must be logged in to reply to this topic. Login to reply