MERGE STATEMENT - what can be my biggest problem with it?

  • Why would I definitely not want to use MERGE statements - in what situations/scenarios? - and resort to old way of multiple updates/inserts.

    I looked at a couple of sites like this:  (and it looks vague... other than 'Merge statement can cause massive deadlocking'  (but even then, when it may and when it may not cause deadlocking, how can i predict?..)

    https://www.mssqltips.com/sqlservertip/3074/use-caution-with-sql-servers-merge-statement/

    What kind of worst issue have you experienced with using MERGE statement?

    THANKS!

    Likes to play Chess

  • Merge statements are good for "make these rows look like those rows" situations.  There are plenty of other uses too.  Unlike UPDATE, DELETE, etc. MERGE is not a single atomic operator.  The three things I would say are, 1) always use SET XACT_ABORT ON, and 2) always use a common table expression to locate records in the target table (there's no WHERE clause in a merge statement), and 3) make sure if there are triggers on the target table they only execute if rows are actually inserted/deleted.

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

  • The only problem I've had with merge is when more than 1 row in the source matches the target.  Example:

    • I have a source queuing up records in stage as records are updated.
    • The source fills up stage faster than my process to load into the target such that an entity record has more than 1 row in stage.

    If I issue a merge without handling the points mentioned, I get an error (below).  The way I handle it is to use the ROW_NUMBER function.  I partition on entity Id and sort by insert date.  That way I always process the oldest row in stage for each entity record.

    I hope that makes sense.

    The MERGE statement attempted to UPDATE or DELETE the same row more than once. This happens when a target row matches more than one source row. A MERGE statement cannot UPDATE/DELETE the same row of the target table multiple times. Refine the ON clause to ensure a target row matches at most one source row, or use the GROUP BY clause to group the source rows.

  • Thanks!  So if there is more than 1 record matching in SOURCE then MERGE picks a random one to insert?

    Do you use ROW_NUMBER inside of CTE in the Source?

    Likes to play Chess

  • If there are more than matches, SQL Server throws an error.

    Here is what I do:

    • The rows load into stage.  At this point I don't care how many rows an individual source record has.  Each row represents a change to the source record.
    • In stage, there is a column named LoadGUID (uniqueidentifier).  The column defaults to NULL.
    • When my loading process executes, the first step is to "grab" a batch of rows from stage by assigning a LoadGUID (code below).
    • Next, the LoadGUID is passed to the appropriated entity load procedure and the batch of rows is processed.
    update a set a.LoadGUID = @LoadGUID
    from (
    select *, row_number() over (partition by CandidateId order by StageLoadDateTime asc) as RowNum
    from Stage.DimCandidates
    where LoadGUID is null
    ) a
    where a.RowNum = 1;
  • It always made me nervous to use MERGE since MS put a disclaimer on the KB page for MERGE (https://docs.microsoft.com/en-us/sql/t-sql/statements/merge-transact-sql?view=sql-server-2017).

    I work in a OLTP shop. We have had issues where someone creates a complex MERGE statement that runs hundreds of times per second and it caused a lot of blocking as well as consumes a large number of resources on the server.

    While I like the idea of MERGE, it's still safer to use normal DML statements IMO.

  • I've found performance to be the biggest problem with a MERGE. If you can replace a merge with an update followed by an insert it will give about 60% better performance. Microsoft even describe this on the BOL for MERGE: https://docs.microsoft.com/en-us/sql/t-sql/statements/merge-transact-sql?view=sql-server-2017

    Performance Tip: The conditional behavior described for the MERGE statement works best when the two tables have a complex mixture of matching characteristics. For example, inserting a row if it doesn't exist, or updating a row if it matches. When simply updating one table based on the rows of another table, improve the performance and scalability with basic INSERT, UPDATE, and DELETE statements. 

     

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply