SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Stairway to T-SQL DML Level 12: Using the MERGE Statement


Stairway to T-SQL DML Level 12: Using the MERGE Statement

Author
Message
Brian Jones-401554
Brian Jones-401554
SSChasing Mays
SSChasing Mays (616 reputation)SSChasing Mays (616 reputation)SSChasing Mays (616 reputation)SSChasing Mays (616 reputation)SSChasing Mays (616 reputation)SSChasing Mays (616 reputation)SSChasing Mays (616 reputation)SSChasing Mays (616 reputation)

Group: General Forum Members
Points: 616 Visits: 295
Great article - we've used MERGE for some time and it reduces the CRUD burden considerably.

However, it should not be used if you are using transaction replication. We've found (and there are others if you google) that replication does not recognise the MERGE statement correctly, and will issue a DELETE/INSERT instead of an UPDATE to the subscribers. This can then collide with any Foreign Key Constraints you may have set up, breaking the replication. Unfortunately this has meant we've had to remove the use of MERGE for a lot of our routines.
GSquared
GSquared
SSC Guru
SSC Guru (56K reputation)SSC Guru (56K reputation)SSC Guru (56K reputation)SSC Guru (56K reputation)SSC Guru (56K reputation)SSC Guru (56K reputation)SSC Guru (56K reputation)SSC Guru (56K reputation)

Group: General Forum Members
Points: 56223 Visits: 9730
paul.barbin (1/16/2013)
Curious to hear from others on whether they use the MERGE command in production and what it's performance is like.

We began using it in our unit test scripts to build data for developers immediately following the database build (we're using SSDT 2012). We wanted to use MERGE so the scripts could run even if developers already had data in their database.

So, rather than having an insert script of 10,000 records, we use the Merge command along with the 10,000 rows using the VALUES statement. This is EXTREMELY slow!
It takes 30 minutes to populate a small database (5GB) vs 2-3 minutes using TRUNCATE/INSERT.

Paul


I use Merge in production all over the place.

For example, I have a series of databases that power websites. Thousands of hits per day. The data for them gets created/updated/deleted in a CRM, then has to be loaded into the databases for the sites. Pretty complex business rules on the load processes, and servers thousands of miles apart from each other.

Prior DBA had it all set up as Truncate then Insert, but I modified it all to use Merge. Used to be, we'd have "blank" periods a couple of time per day, and employees got used to (a) fielding questions about that from customers, and (b) making changes and not having them show up till (sometimes) late the next day.

I modified it all to use Merge, and set it to run hourly. No "blank periods", and minimal waits for data to go live. Everyone loves it.

However, since Merge has to lock the tables that are being reloaded, and usually needs a full table-lock because of the scope of the action, it would have failed if I didn't combine it with Read Committed Snapshot Isolation. That allows queries from the sites to keep running and not be blocked by the large Merge actions.

Tested without that, and it was bad. (Was in a test environment, so no impact on anyone. But if Merge had gone live without that testing, it would have been worse than the prior "the tables are temporarily empty" issues.)

So, it can be pretty useful, but test against expected concurrent load.

Also tried more traditional two-step update/insert actions, but Merge performed better than those in my case. I've seen it take longer, however, so, again, test vs more traditional methods.

Edit: Truncate+Insert will be faster. It doesn't require any join math between the source and destination. That alone will make it faster. If speed is the only concern, then truncate+insert will almost certainly be faster. However, that option also means you can't have FKs referencing the table, and all the other issues that go along with truncate. But if speed is really all that matters, use it.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
paul.barbin
paul.barbin
Valued Member
Valued Member (59 reputation)Valued Member (59 reputation)Valued Member (59 reputation)Valued Member (59 reputation)Valued Member (59 reputation)Valued Member (59 reputation)Valued Member (59 reputation)Valued Member (59 reputation)

Group: General Forum Members
Points: 59 Visits: 106
@Brian, that delete/insert action in transactional replication is a known bug (or at least it used to be). We had that problem many moons ago and we weren't using the MERGE command. MS said that if the key field in an index is updated, transactional replication views that as a Delete/Insert pair. We had to have special stored procs on the subscriber to handle the error for us. When the Delete sp was called, we checked to see if the delete would fail due to FK constraints, if so, we determined that it was a "false" delete and we'd ignore the delete command. The following command would be the insert and we'd check to see if the record already existed. If so, we'd just update it with the values. That was way back in 2000 and 2005. No idea if that's corrected in the newer versions.

@Gsquared, thanks. I was just curious, I haven't used MERGE outside of this one case. In this particular case, speed isn't crucial, but the time difference is SO great that it starts to become an issue. What we wanted was the ability to run the script regardless of the state of the db and MERGE gives that ability where "kill and fill" doesn't. Good to know that when used in other applications that it is as fast or faster.

Thanks.
GSquared
GSquared
SSC Guru
SSC Guru (56K reputation)SSC Guru (56K reputation)SSC Guru (56K reputation)SSC Guru (56K reputation)SSC Guru (56K reputation)SSC Guru (56K reputation)SSC Guru (56K reputation)SSC Guru (56K reputation)

Group: General Forum Members
Points: 56223 Visits: 9730
There are some very solid, and pretty intuitive, ways to speed up Merge.

For my cross-server ETL, I set up a bulk-copy database that just copies the source tables from the other server. Very fast because it's minimally logged and doesn't incure DTC locks. Then I Merge from that. But that kind of "stage first, load second" isn't specific to Merge. I'd have done that no matter what my final choice was for loading the target tables, since it would apply equally to kill'n'fill, or update/insert (2-step), or anything else.

Another way to really speed up Merge is to make sure the On statement from source to target, is properly indexed, and (preferably) has index-enforced uniqueness for the source. For complex source queries, it can be faster to load a temp table, index that, then use Merge with that as the source.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
R.P.Rozema
R.P.Rozema
SSCrazy
SSCrazy (2.8K reputation)SSCrazy (2.8K reputation)SSCrazy (2.8K reputation)SSCrazy (2.8K reputation)SSCrazy (2.8K reputation)SSCrazy (2.8K reputation)SSCrazy (2.8K reputation)SSCrazy (2.8K reputation)

Group: General Forum Members
Points: 2760 Visits: 1716
@Ryan.Polk: Yes, you can do that, but it is not the same. First, as you just demonstrated, it is easy to forget that you need to add the same condition to both the "when not matched by source" and the "when matched" clauses. But on top of that, if you compare the query plans and the io statistics for both options, you will see that my approach with the cte performs better than your alternative. i.e. my cte approach is easier on the eye plus it performs better.

statistics io for cte approach:
Table 'suppliers'. Scan count 0, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'supplierproducts'. Scan count 1, logical reads 6, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#4C8B54C9'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.


statistics io for 2 Ryan.Polk method:
Table 'suppliers'. Scan count 0, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'supplierproducts'. Scan count 2, logical reads 13, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#4C8B54C9'. Scan count 2, logical reads 17, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.




Posting Data Etiquette - Jeff Moden
Posting Performance Based Questions - Gail Shaw
Hidden RBAR - Jeff Moden
Cross Tabs and Pivots - Jeff Moden
Catch-all queries - Gail Shaw


If you don't have time to do it right, when will you have time to do it over?
boriskey
boriskey
SSC-Enthusiastic
SSC-Enthusiastic (143 reputation)SSC-Enthusiastic (143 reputation)SSC-Enthusiastic (143 reputation)SSC-Enthusiastic (143 reputation)SSC-Enthusiastic (143 reputation)SSC-Enthusiastic (143 reputation)SSC-Enthusiastic (143 reputation)SSC-Enthusiastic (143 reputation)

Group: General Forum Members
Points: 143 Visits: 256
i also found that performance of MERGE is a bit worse than just INSERT/UPDATE - with indexes in place - it is not a huge difference but still noticeable.

Also it is hard to debug/troubleshoot MERGE statements - with INSERT/UPDATEs you can just select the SELECT part with joins and WHERE clause and run it to see results and play with them till you are ready to do actual inserts/updates.

With Merge statement, you would have to cut out the code from it and run on a side or use output clause which you have to type or uncomment.

One pain at least in 2008, that you cannot use results on OUTPUT clause outside of the MERGE (as a inner query) - you can only insert results to a table variable which you need to define and create first.

So after some first excitements our team is back on using old good UPSERTs.
Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (209K reputation)SSC Guru (209K reputation)SSC Guru (209K reputation)SSC Guru (209K reputation)SSC Guru (209K reputation)SSC Guru (209K reputation)SSC Guru (209K reputation)SSC Guru (209K reputation)

Group: General Forum Members
Points: 209705 Visits: 41973
Greg,

Would you respond to the following post concerning your stairway. Thanks.
http://www.sqlservercentral.com/Forums/Topic1489336-90-1.aspx

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Vertigo44
Vertigo44
Right there with Babe
Right there with Babe (724 reputation)Right there with Babe (724 reputation)Right there with Babe (724 reputation)Right there with Babe (724 reputation)Right there with Babe (724 reputation)Right there with Babe (724 reputation)Right there with Babe (724 reputation)Right there with Babe (724 reputation)

Group: General Forum Members
Points: 724 Visits: 840
I would like to get started with these...

DML:
http://www.sqlservercentral.com/stairway/75773/

Advanced T-SQL:
http://www.sqlservercentral.com/stairway/104497/

The only thing I could not determine is which version of SQL Server Express I should install and what database these are written for (ie: Adventure Works)?

Thanks!
Iwas Bornready
Iwas Bornready
SSC-Insane
SSC-Insane (22K reputation)SSC-Insane (22K reputation)SSC-Insane (22K reputation)SSC-Insane (22K reputation)SSC-Insane (22K reputation)SSC-Insane (22K reputation)SSC-Insane (22K reputation)SSC-Insane (22K reputation)

Group: General Forum Members
Points: 22106 Visits: 885
I don't know that much about MERGE. It sort of scares me. We've managed to create all of our stored procedures without it.
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search