Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase ««123»»

Big Performance Problem with Merge Statement Expand / Collapse
Author
Message
Posted Saturday, June 22, 2013 1:02 AM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 5:03 PM
Points: 35,954, Visits: 30,243
David Moutray (6/21/2013)
Nothing? I was sure that one of those brilliant SQL veterans like Gail Shaw, Jeff Moden, Grant Fritchey, Steve Jones or Adam Machanic would have some insight into this problem.

Help me, Obi-Wan, you're my only hope!


To be absolutely honest, I've heard/read of several problems with MERGE and so never made the "leap" to use it in SQL Server. The only time I've used MERGE was way-back-when I had to use Oracle. The only reason I even considered using it then was because UPDATE in Oracle doesn't have a FROM clause and it was a whole lot simpler to use MERGE as a surrogate for UPDATEs there.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

"Change is inevitable. Change for the better is not." -- 04 August 2013
(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1466441
Posted Sunday, June 23, 2013 8:51 AM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Today @ 8:58 PM
Points: 11,168, Visits: 10,930
The main issue is the cardinality estimation for the Filter in the MERGE plan: estimated rows 32 million, actual 10 thousand.

The sorts in the plan require a memory grant, which is sized based on the estimated number of rows. Once the merge update actions have been Split into separate deletes and inserts, the estimate at the sort is 57.5 million rows. Ironically, none of the non-clustered indexes are changed in the example plan you gave, so the sorts actually encounter zero rows!

As you may know, queries cannot start executing until workspace memory has been granted. Your example MERGE plan ended up acquiring over 11GB of memory (!) though it may have asked for even more to begin with. It is highly likely the query had to wait a considerable time before 11GB could be granted. You can monitor memory grants using sys.dm_exec_query_memory_grants. If you had been using SQL Server 2012, the wait time would also have been reported in the execution plan.

The wait for memory grant is the reason the 80-row run took so long, and the wildly inaccurate cardinality estimate is the reason for the ridiculous memory grant.

The cardinality estimate is wrong because this Filter eliminates rows for which no merge action (insert, update, or delete) is necessary. The unusual structure of the MERGE statement makes estimating the number of rows that require no changes very difficult, and the cardinality estimator gets it very wrong. The offending part of the MERGE statement is:

WHEN NOT MATCHED BY SOURCE
AND Production.ParticipantID IN ( SELECT ParticipantID FROM ETL.ParticipantResponseBuild )

I understand what you are trying to do here, and why you wrote this clause this way but that doesn't change the fact that it is unusual and makes an already-difficult cardinality estimation problem just about impossible. To the optimizer, it looks very much as if almost all rows will result in an insert, update, or delete.

If you cannot reformulate the MERGE to handle the DELETE option using more transparent logic, use the MERGE for the INSERT and UPDATE and perform a separate DELETE. Or just use the three separate statements, of course, if you find that performs better. I would not say "never use MERGE" but it can require skilled tuning in many cases, and is rarely the best choice where parallel execution is needed. Cardinality estimation is much easier in the case of separate statements.

Other things, if you are interested:

MERGE is not optimized for large set processing, despite the common intuition that one statement ought to be faster than three. On the other hand, MERGE does contain some very specific optimizations for OLTP (in particular, the elimination of Halloween Protection in certain cases). There are particular considerations for MERGE that make it unsuitable for large sets. Some of these are due to implementation decisions, some are optimizer limitations, and others just come down to the immense complexity of the merging operation itself.

The Compute Scalar that determines the merge action is a Compute Sequence. This operator cannot tolerate parallel execution, so a parallel MERGE plan will stop and start parallel execution either side of it. The costs of stopping and restarting often result in MERGE plans that do not use parallelism, where it might otherwise be expected.

You can improve the cardinality estimates in some areas of the plans by adding OPTION (RECOMPILE) to the queries that reference the local variable @LastKeyCompleted. This hint allows the optimizer to see the current value of the variable and optimize accordingly. Otherwise, the plan is based on a guess. Recompiling may not take long here compared to the run time of the query, so it could be a price worth paying to get a plan tailored to the current value of @LastKeyCompleted.

The construct OUTER JOIN ... WHERE NULL is almost never preferable to writing a NOT EXISTS. The logical requirement is an Anti Semi Join, performing a full join and then rejecting NULLs is an odd way to express it. This topic has been written about many times, always with the same results. Use NOT EXISTS.

HOLDLOCK = SERIALIZABLE. There are good reasons to use this with MERGE for highly concurrent systems, but it would be quite unusual for ETL. Be sure you need this hint.




Paul White
SQL Server MVP
SQLblog.com
@SQL_Kiwi
Post #1466528
Posted Monday, June 24, 2013 1:32 AM


SSCarpal Tunnel

SSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal Tunnel

Group: General Forum Members
Last Login: Today @ 11:50 AM
Points: 4,914, Visits: 8,652
Paul, your answers could be collected in a book and it would be an absolute best-seller!
Awesome, as usual.


--
Gianluca Sartori

Get your two-cent-answer quickly
spaghettidba.com
@spaghettidba
Post #1466610
Posted Monday, June 24, 2013 2:10 AM


SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Tuesday, April 08, 2014 6:13 AM
Points: 1,694, Visits: 19,550
spaghettidba (6/24/2013)
Paul, your answers could be collected in a book and it would be an absolute best-seller!
Awesome, as usual.


+1


____________________________________________________

How to get the best help on a forum

http://www.sqlservercentral.com/articles/Best+Practices/61537

Never approach a goat from the front, a horse from the rear, or a fool from any direction.
Post #1466620
Posted Monday, June 24, 2013 10:30 AM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 5:03 PM
Points: 35,954, Visits: 30,243
spaghettidba (6/24/2013)
Paul, your answers could be collected in a book and it would be an absolute best-seller!
Awesome, as usual.


+1000 to that. What would be even better is if MS really started listening to him.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

"Change is inevitable. Change for the better is not." -- 04 August 2013
(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1466833
Posted Monday, June 24, 2013 12:31 PM
SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Today @ 11:03 AM
Points: 218, Visits: 769
I'll second everyone else's comments: fascinating stuff, Paul. As I was saying, my "problem" is resolved, but I was really interested to understand why SQL was behaving that way. So, thank you, very much. You make it very clear.

I will be particularly interested to play around with using NOT EXISTS logic where we currently use WHERE IS NOT NULL.

Post #1466861
Posted Monday, June 24, 2013 12:42 PM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Today @ 8:58 PM
Points: 11,168, Visits: 10,930
David Moutray (6/24/2013)
I will be particularly interested to play around with using NOT EXISTS logic where we currently use WHERE IS NOT NULL.

If you feel like reading up on it a bit before diving in, these two articles are my favourites:

http://www.sqlperformance.com/2012/12/t-sql-queries/left-anti-semi-join
http://explainextended.com/2009/09/15/not-in-vs-not-exists-vs-left-join-is-null-sql-server/




Paul White
SQL Server MVP
SQLblog.com
@SQL_Kiwi
Post #1466866
Posted Monday, June 24, 2013 12:58 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 5:03 PM
Points: 35,954, Visits: 30,243
Couple of other good ones on the same subject...

http://sqlinthewild.co.za/index.php/2010/02/18/not-exists-vs-not-in/
http://sqlinthewild.co.za/index.php/2010/03/23/left-outer-join-vs-not-exists/


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

"Change is inevitable. Change for the better is not." -- 04 August 2013
(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1466870
Posted Monday, June 24, 2013 1:13 PM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Today @ 8:58 PM
Points: 11,168, Visits: 10,930

Oh yes, of course. Thanks!




Paul White
SQL Server MVP
SQLblog.com
@SQL_Kiwi
Post #1466873
Posted Tuesday, June 25, 2013 1:02 AM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Today @ 9:33 AM
Points: 6,754, Visits: 12,854
spaghettidba (6/24/2013)
Paul, your answers could be collected in a book and it would be an absolute best-seller!
Awesome, as usual.


Absolutely. Paul, you don't often post on ssc these days, but when you do it's a real blockbuster of a learning opportunity for us mere mortals. Thank you!


“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Exploring Recursive CTEs by Example Dwain Camps
Post #1467003
« Prev Topic | Next Topic »

Add to briefcase ««123»»

Permissions Expand / Collapse