Is MERGE always the way to go for upserts?

  • I find myself always wanting to use MERGE. Some of the other DBA's aren't so sure about this and would prefer the IF EXISTS UPDATE ELSE INSERT approach.

    Should I stick with the MERGE or use the pre-MERGE upsert for single rows?

    Thoughts?

    Edit: Probably should have posted this in TSQL. My first time posting on sqlservercentral.

  • Merge's power is in the single transaction / modification capability for both sides of the Upsert, which implies multiple rows, and not having to double-pass the data.

    By doing a select and making a decision once , you leave the opportunity for multiple simultaneous calls to step over each other. Not a big deal 99.9% of the time as you're usually working of IDs or something for the Update vs. Insert pieces, so two records from two users simultaneously are never the exact same data. However, if you're using business keys as your primary locator, this MIGHT (incredibly rarely) cause an issue on unique indexes/PKs.

    That said, for a single row, 6 to 1, half a dozen to the other. One double passes the data in two transactions (and different locking requirements) with a read and then a change, the other double passes the data in a single transaction and then performs the task. There's not going to be enough of a time difference for it to matter.

    For the sake of consistency, however, I'd stick with MERGE for all upserts, so that you don't have different standards and so that further growth or modification doesn't force a code rewrite.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • Hi Craig,

    Can you help me understand this better? I would not be able to explain it to my rubber ducky for sure, if I had one 🙂 I don't understand the difference... For example:

    IF EXISTS (SELECT * FROM Product WHERE id = 6)

    BEGIN

    UPDATE Product

    SET ProductName = 'name 6'

    WHERE id = 6

    END

    ELSE

    BEGIN

    INSERT INTO Product

    SELECT 6, 'name 6'

    END

    What would be the equivalent MERGE command and is there some difference between example above and that corresponding MERGE statement?

    Thank you!

  • IF EXISTS (SELECT * FROM Product WHERE id = 6)

    BEGIN

    MERGE Product AS Tgrt

    USING Product AS Src

    ON Trgt.id = Src.id

    WHEN MATCHED THEN

    UPDATE Product

    SET ProductName = 'name 6'

    WHEN NOT MATCHED THEN

    INSERT INTO Product

    SELECT 6, 'name 6'

    END

    Above would be the equivalent code with MERGE. But this would not be the typical situation to use MERGE. You could do with what you had earlier.

  • I'd do it more or less like so:

    CREATE TABLE #Product (Id INT, ProductName VARCHAR(50))

    DECLARE @Id INT=6, @ProductName VARCHAR(50)='name 6'

    ;MERGE #Product AS Tgt

    USING (SELECT @Id, @ProductName) AS Src (Id, ProductName)

    ON Tgt.Id = Src.Id

    WHEN MATCHED AND Tgt.ProductName<>Src.ProductName THEN

    UPDATE SET ProductName=Src.ProductName

    WHEN NOT MATCHED THEN

    INSERT (Id, ProductName)

    VALUES (Src.Id, Src.ProductName);

    SELECT * FROM #Product

  • Oksana March (6/16/2012)


    Hi Craig,

    Can you help me understand this better? I would not be able to explain it to my rubber ducky for sure, if I had one 🙂 I don't understand the difference... For example:

    No worries.

    The code posted above will walk you through the 'how'. Sorry, I don't have permanent access to 2k8 at the moment, only when I'm at the office for one of our test boxes so I couldn't confirm I had the right code over the weekend.

    Anyway, let's take a look at the short difference:

    MERGE:

    Merge begins by opening shared locks on the records it needs, then escalating those locks to exclusive during the same transaction for the updatable items. This means between the opening and update nothing can update/remove the item, as the shared lock stays in existance and the row is also already located for the update. Similar to how an Update works by itself in isolation, you're just including the insert step at the same time.

    Now, let's look below:

    IF EXISTS (SELECT * FROM Product WHERE id = 6)

    Step 1: Open a transaction and find the record. Confirm existance. Release all locks and close the transaction. Assuming it's found.

    --- Unsaid: Other things can touch the row before the next piece activates. It's in milliseconds but get enough volume and things waiting for lock releases and you can get interrupted.

    BEGIN

    UPDATE Product

    SET ProductName = 'name 6'

    WHERE id = 6

    END

    Step 2: Open up a NEW transaction that will have to relocate the row and overwrite the ProductName information.

    This might not seem like much for a simple query, but the simple matter of having to locate the row twice will start to eat processing time eventually. Will heavily depend on call volume.

    Where it really hurts, and multi-transactions can pound on you, is when you start doing back to back joins off a staging table or similar, one for updates where the record is found and one with a left join IS NULL to locate inserts. Besides possible gaps in the update vs. insert components (or doing inserts then updates, meaning you'll doubletap your insert records) due to other inserts occurring, you're re-seeking (or worse, scanning) the same data.

    What would be the equivalent MERGE command and is there some difference between example above and that corresponding MERGE statement?

    Thank you!

    To have the statement above and MERGE to correspond transactionally, you'd have to wrap the entire thing in a Transaction, Repeatable read at the least, Serializable by preference. Hopefully that helps, if not let me know where I'm not communicating well.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • Craig hit all the salient points here, as usual. Concurrency is the KILLER in UPSERT scenarios. You CAN handle it yourself, but in order to do so with multiple statements you MUST take care to handle concurrent access problems that hare inherently handled by MERGE under the covers.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply