Merge error handling

  • emailgregn

    SSC-Addicted

    Points: 462

    Comments posted to this topic are about the item Merge error handling

  • R.P.Rozema

    SSChampion

    Points: 12300

    Good article, thank you!

    One addition though: if your procedure is called from within a trigger or from a service broker handler, most likely the catch block won't be called upon many errors. Instead execution is resumed from the 1st line after the call to your procedure upon these errors. This is because in these contexts the xact_abort setting defaults to "on". To make sure your catch block does get executed for most (but still not all) errors, include a "set xact_abort off;" line as one of the first statements inside your procedure, before you open your try block. Adding this line won't harm your procedure in other contexts, but it makes sure your catch block will get executed even when called from these contexts.
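
    A minimal sketch of that shape (the body and object names here are only placeholders, not the article's actual procedure):

        CREATE PROCEDURE dbo.usp_example_proc -- hypothetical name
            @id int
        AS
        BEGIN
            SET NOCOUNT ON;
            SET XACT_ABORT OFF; -- as described above: restore the default so the catch block fires in trigger / Service Broker contexts

            BEGIN TRY
                -- the real work (e.g. the MERGE) goes here
                UPDATE dbo.SomeTable SET some_col = 1 WHERE id = @id; -- placeholder statement
            END TRY
            BEGIN CATCH
                -- error handling / logging goes here
                SELECT ERROR_NUMBER() AS err_no, ERROR_MESSAGE() AS err_msg;
            END CATCH
        END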



    Posting Data Etiquette - Jeff Moden
    Posting Performance Based Questions - Gail Shaw
    Hidden RBAR - Jeff Moden
    Cross Tabs and Pivots - Jeff Moden
    Catch-all queries - Gail Shaw


    If you don't have time to do it right, when will you have time to do it over?

  • vishal.gamji

    SSCertifiable

    Points: 6126

    What are your views on using this kind of logic in an ETL flow for slowly changing dimensions - overkill? What about the performance implications when it goes into the slow path?

    For OLTP applications, this looks like an AWESOME post. Can someone comment on ETL usage?

  • rockvilleaustin

    SSC-Addicted

    Points: 412

    I have been struggling to identify the error rows in the MERGE process, so thanks for sharing this.

    What if there are millions of rows to merge ... will this affect the performance?

  • emailgregn

    SSC-Addicted

    Points: 462

    Performance on the error-free "fast path" is great and I'm easily pushing through merges of 10k rows. On the slow path, when there are errors, performance is going to be poor because of the row-by-row iteration. There are some performance improvements to be had, like putting an index on the UDDT, but it's always going to be expensive.

  • Brian Carlson

    SSCarpal Tunnel

    Points: 4154

    For large updates, indexing the update_id column improves performance; here it's done via the PRIMARY KEY on the table type:

    CREATE TYPE CustomerUpdatesType AS TABLE (
        update_id    int IDENTITY(1,1) NOT NULL PRIMARY KEY,
        customer_id  int NULL,
        title        nvarchar(16) NULL,
        first_name   nvarchar(32) NULL,
        last_name    nvarchar(32) NULL,
        phone_number nvarchar(10) NULL
    );
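
    For example, populating a variable of the type and passing it as a TVP (the procedure name below is just an example):

        DECLARE @updates CustomerUpdatesType;

        INSERT INTO @updates (customer_id, title, first_name, last_name, phone_number)
        VALUES (1, N'Mr', N'John', N'Smith', N'5550100');

        -- note: the TVP parameter on the receiving procedure must be declared READONLY
        EXEC dbo.usp_merge_customer_updates @CustomerUpdates = @updates; -- hypothetical procedure name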

  • emailgregn

    SSC-Addicted

    Points: 462

    R.P.Rozema (11/2/2010)


    Good article, thank you!

    One addition though: if your procedure is called from within a trigger or from a service broker handler, most likely the catch block won't be called upon many errors. Instead execution is resumed from the 1st line after the call to your procedure upon these errors. This is because in these contexts the xact_abort setting defaults to "on". To make sure your catch block does get executed for most (but still not all) errors, include a "set xact_abort off;" line as one of the first statements inside your procedure, before you open your try block. Adding this line won't harm your procedure in other contexts, but it makes sure your catch block will get executed even when called from these contexts.

    I hadn't thought about that so thank you.

  • emailgregn

    SSC-Addicted

    Points: 462

    vishal.gamji (11/2/2010)


    What are your views on using this kind of logic in an ETL flow for slowly changing dimensions - overkill? What about the performance implications when it goes into the slow path?

    For OLTP applications, this looks like an AWESOME post. Can someone comment on ETL usage?

    I'd imagine this would be a good fit for a lot of ETL situations, although I confess to using it in OLTP.

  • SQLRNNR

    SSC Guru

    Points: 281210

    Thanks for sharing this article.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw
    Learn Extended Events

  • devora.fs

    SSC Veteran

    Points: 275

    Thank you, very helpful.

    I haven't thought it through fully, but it might be better to process just that one faulty row and then resubmit all rows with an id greater than the error row's id.
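
    Roughly something like this, reusing the table type from earlier in the thread (the variable and procedure names are only illustrative):

        -- handle the single faulty row on its own, then resubmit everything after it in one set
        DECLARE @retry CustomerUpdatesType;

        INSERT INTO @retry (customer_id, title, first_name, last_name, phone_number)
        SELECT customer_id, title, first_name, last_name, phone_number
        FROM   @CustomerUpdates
        WHERE  update_id > @error_update_id; -- only the rows after the error row

        EXEC dbo.usp_merge_customer_updates @CustomerUpdates = @retry; -- hypothetical procedure name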

  • Treehouse

    Ten Centuries

    Points: 1234

    This is good stuff... exactly what I need to troubleshoot a strange unique key violation with a MERGE.

    Question though - why is the IDENTITY column of the type an INT, while the @single_update_id param to the proc is a BIGINT? I would guess the two datatypes should match, with the size depending on what the particular application needs.

    Thanks!

  • emailgregn

    SSC-Addicted

    Points: 462

    Matthew-399425 (11/4/2010)


    Question though - why is the IDENTITY column of the type an INT, while the @single_update_id param to the proc is a BIGINT? I would guess the two datatypes should match, with the size depending on what the particular application needs.

    You're right, the datatypes should match and sizing depends on the application.

  • emailgregn

    SSC-Addicted

    Points: 462

    Something else I recently learned that I left out of the article for clarity is about duplicate keys in the source data.

    MERGE breaks horribly if there are duplicate alternate keys in the <table_source>, i.e. rows with the same customer_id in the @CustomerUpdates UDDT.

    A validation step that detects or removes any duplicates before the MERGE statement can help catch otherwise hard-to-detect errors.
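
    For instance, a check along these lines before the MERGE (column names as in the CustomerUpdatesType posted above):

        -- fail fast if the same customer_id appears more than once in the batch
        IF EXISTS (
            SELECT customer_id
            FROM   @CustomerUpdates
            GROUP BY customer_id
            HAVING COUNT(*) > 1
        )
        BEGIN
            RAISERROR('Duplicate customer_id values in the source; MERGE needs a unique key per source row.', 16, 1);
            RETURN;
        END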

    HTH

  • jpratt-797544

    SSC Eights!

    Points: 847

    This is a great post and a great demo of recursion capabilities in T-SQL/SPs. My current issue is the duplicate key/ID in the source data. What I am trying to do is get the recursive call to just skip to the next highest ID, but the MERGE OUTPUT apparently does not contain anything at all if an error occurs, so I can't get the MAX(ID) from the output. In addition, I would rather not have to keep creating UDDTs and passing TVPs for every MERGE I need to do.

    *sigh* It seems like another bright idea that Microsoft didn't implement very well (see postings online about duplicate key handling and other features of other RDBMSes).

    I need single-row error handling and logging, so I am going back to trusty cursors that loop through the IDs and then do a MERGE for each record processed. FYI, cursors perform well when the cursor declaration only selects indexed IDs; you then do the work of retrieving the other values inside the cursor loop.
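
    In outline, the pattern I mean (table, column and procedure names are placeholders for my actual objects):

        DECLARE @id int;

        DECLARE id_cursor CURSOR LOCAL FAST_FORWARD FOR
            SELECT customer_id FROM dbo.CustomerStaging ORDER BY customer_id; -- indexed key only

        OPEN id_cursor;
        FETCH NEXT FROM id_cursor INTO @id;

        WHILE @@FETCH_STATUS = 0
        BEGIN
            BEGIN TRY
                -- retrieve the remaining columns and MERGE just this one id
                EXEC dbo.usp_merge_one_customer @customer_id = @id; -- hypothetical wrapper around the MERGE
            END TRY
            BEGIN CATCH
                INSERT INTO dbo.merge_error_log (customer_id, error_number, error_message)
                VALUES (@id, ERROR_NUMBER(), ERROR_MESSAGE());
            END CATCH

            FETCH NEXT FROM id_cursor INTO @id;
        END

        CLOSE id_cursor;
        DEALLOCATE id_cursor;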
