Merge error handling

  • emailgregn

    SSC-Addicted

    Points: 462

    Comments posted to this topic are about the item Merge error handling

  • R.P.Rozema

    SSChampion

    Points: 12300

    Good article, thank you!

    One addition though: if your procedure is called from within a trigger or from a service broker handler, most likely the catch block won't be called upon many errors. Instead execution is resumed from the 1st line after the call to your procedure upon these errors. This is because in these contexts the xact_abort setting defaults to "on". To make sure your catch block does get executed for most (but still not all) errors, include a "set xact_abort off;" line as one of the first statements inside your procedure, before you open your try block. Adding this line won't harm your procedure in other contexts, but it makes sure your catch block will get executed even when called from these contexts.
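
    A minimal sketch of that shape (the body and object names here are only placeholders, not the article's actual procedure):

        CREATE PROCEDURE dbo.usp_example_proc -- hypothetical name
            @id int
        AS
        BEGIN
            SET NOCOUNT ON;
            SET XACT_ABORT OFF; -- as described above: restore the default so the catch block fires in trigger / Service Broker contexts

            BEGIN TRY
                -- the real work (e.g. the MERGE) goes here
                UPDATE dbo.SomeTable SET some_col = 1 WHERE id = @id; -- placeholder statement
            END TRY
            BEGIN CATCH
                -- error handling / logging goes here
                SELECT ERROR_NUMBER() AS err_no, ERROR_MESSAGE() AS err_msg;
            END CATCH
        END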



    Posting Data Etiquette - Jeff Moden
    Posting Performance Based Questions - Gail Shaw
    Hidden RBAR - Jeff Moden
    Cross Tabs and Pivots - Jeff Moden
    Catch-all queries - Gail Shaw


    If you don't have time to do it right, when will you have time to do it over?

  • vishal.gamji

    SSCertifiable

    Points: 6126

    What are your views on using this kind of logic in an ETL flow for slowly changing dimensions - overkill? What about the performance implications when it goes into the slow path?

    For OLTP applications, this looks like an AWESOME post. Can someone comment on ETL usage?

  • rockvilleaustin

    SSC-Addicted

    Points: 412

    I have been struggling to identify the error rows in the MERGE process, so thanks for sharing this.

    What if there are millions of rows to merge ... will this affect the performance?

  • emailgregn

    SSC-Addicted

    Points: 462

    Performance on the error-free "fast path" is great and I'm easily pushing through merges of 10k rows. On the slow path, when there are errors, performance is going to be poor because of the row-by-row iteration. There are some performance improvements to be had, like putting an index on the UDDT, but it's always going to be expensive.

  • Brian Carlson

    SSCarpal Tunnel

    Points: 4154

    For large updates, indexing the update_id column improves performance; here it's done via the PRIMARY KEY on the table type:

    CREATE TYPE CustomerUpdatesType AS TABLE (
        update_id    int IDENTITY(1,1) NOT NULL PRIMARY KEY,
        customer_id  int NULL,
        title        nvarchar(16) NULL,
        first_name   nvarchar(32) NULL,
        last_name    nvarchar(32) NULL,
        phone_number nvarchar(10) NULL
    );
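
    For example, populating a variable of the type and passing it as a TVP (the procedure name below is just an example):

        DECLARE @updates CustomerUpdatesType;

        INSERT INTO @updates (customer_id, title, first_name, last_name, phone_number)
        VALUES (1, N'Mr', N'John', N'Smith', N'5550100');

        -- note: the TVP parameter on the receiving procedure must be declared READONLY
        EXEC dbo.usp_merge_customer_updates @CustomerUpdates = @updates; -- hypothetical procedure name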

  • emailgregn

    SSC-Addicted

    Points: 462

    R.P.Rozema (11/2/2010)


    Good article, thank you!

    One addition though: if your procedure is called from within a trigger or from a service broker handler, most likely the catch block won't be called upon many errors. Instead execution is resumed from the 1st line after the call to your procedure upon these errors. This is because in these contexts the xact_abort setting defaults to "on". To make sure your catch block does get executed for most (but still not all) errors, include a "set xact_abort off;" line as one of the first statements inside your procedure, before you open your try block. Adding this line won't harm your procedure in other contexts, but it makes sure your catch block will get executed even when called from these contexts.

    I hadn't thought about that so thank you.

  • emailgregn

    SSC-Addicted

    Points: 462

    vishal.gamji (11/2/2010)


    What are your views on using this kind of logic in an ETL flow for slowly changing dimensions - overkill? What about the performance implications when it goes into the slow path?

    For OLTP applications, this looks like an AWESOME post. Can someone comment on ETL usage?

    I'd imagine this would be a good fit for a lot of ETL situations, although I confess to using it in OLTP.

  • SQLRNNR

    SSC Guru

    Points: 281210

    Thanks for sharing this article.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw
    Learn Extended Events

  • devora.fs

    SSC Veteran

    Points: 275

    Thank you, very helpful.

    I haven't thought it through fully, but it might be better to process just that one faulty row and then resubmit all rows with an id greater than the error row's id.
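
    Roughly something like this, reusing the table type from earlier in the thread (the variable and procedure names are only illustrative):

        -- handle the single faulty row on its own, then resubmit everything after it in one set
        DECLARE @retry CustomerUpdatesType;

        INSERT INTO @retry (customer_id, title, first_name, last_name, phone_number)
        SELECT customer_id, title, first_name, last_name, phone_number
        FROM   @CustomerUpdates
        WHERE  update_id > @error_update_id; -- only the rows after the error row

        EXEC dbo.usp_merge_customer_updates @CustomerUpdates = @retry; -- hypothetical procedure name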

  • Treehouse

    Ten Centuries

    Points: 1234

    This is good stuff... exactly what I need to troubleshoot a strange unique key violation with a MERGE.

    Question though - why is the IDENTITY column of the type an INT, while the @single_update_id param to the proc is a BIGINT? I would guess the two datatypes should match, with the size depending on what the particular application needs.

    Thanks!

  • emailgregn

    SSC-Addicted

    Points: 462

    Matthew-399425 (11/4/2010)


    Question though - why is the IDENTITY column of the type an INT, while the @single_update_id param to the proc is a BIGINT? I would guess the two datatypes should match, with the size depending on what the particular application needs.

    You're right, the datatypes should match and sizing depends on the application.

  • emailgregn

    SSC-Addicted

    Points: 462

    Something else I recently learned that I left out of the article for clarity is about duplicate keys in the source data.

    MERGE breaks horribly if there are duplicate alternate keys in the <table_source>, i.e. rows with the same customer_id in the @CustomerUpdates UDDT.

    A validation step that detects or removes any duplicates before the MERGE statement can help catch otherwise hard-to-detect errors.
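
    For instance, a check along these lines before the MERGE (column names as in the CustomerUpdatesType posted above):

        -- fail fast if the same customer_id appears more than once in the batch
        IF EXISTS (
            SELECT customer_id
            FROM   @CustomerUpdates
            GROUP BY customer_id
            HAVING COUNT(*) > 1
        )
        BEGIN
            RAISERROR('Duplicate customer_id values in the source; MERGE needs a unique key per source row.', 16, 1);
            RETURN;
        END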

    HTH

  • jpratt-797544

    SSC Eights!

    Points: 847

    This is a great post and a great demo of recursion capabilities in T-SQL/SPs. My current issue is the duplicate key/ID in the source data. What I am trying to do is get the recursive call to just skip to the next highest ID, but the MERGE OUTPUT apparently does not contain anything at all if an error occurs, so I can't get the MAX(ID) from the output. In addition, I would rather not have to keep creating UDDTs and passing TVPs for every MERGE I need to do.

    *sigh* It seems like another bright idea that Microsoft didn't implement very well (see postings online about duplicate key handling and other features of other RDBMSes).

    I need single-row error handling and logging, so I am going back to trusty cursors that loop through the IDs and then do a MERGE for each record processed. FYI, cursors perform well when the cursor declaration only selects indexed IDs; you then do the work of retrieving the other values inside the cursor loop.
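
    In outline, the pattern I mean (table, column and procedure names are placeholders for my actual objects):

        DECLARE @id int;

        DECLARE id_cursor CURSOR LOCAL FAST_FORWARD FOR
            SELECT customer_id FROM dbo.CustomerStaging ORDER BY customer_id; -- indexed key only

        OPEN id_cursor;
        FETCH NEXT FROM id_cursor INTO @id;

        WHILE @@FETCH_STATUS = 0
        BEGIN
            BEGIN TRY
                -- retrieve the remaining columns and MERGE just this one id
                EXEC dbo.usp_merge_one_customer @customer_id = @id; -- hypothetical wrapper around the MERGE
            END TRY
            BEGIN CATCH
                INSERT INTO dbo.merge_error_log (customer_id, error_number, error_message)
                VALUES (@id, ERROR_NUMBER(), ERROR_MESSAGE());
            END CATCH

            FETCH NEXT FROM id_cursor INTO @id;
        END

        CLOSE id_cursor;
        DEALLOCATE id_cursor;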
