How To Avoid MERGE in SQL Server

  • andrei_solntsev wrote:

    Sorry, but this script won't work well for large tables, especially with heavy use of the IN() and NOT IN() operators. MERGE is usually faster than a series of consecutive INSERT/UPDATE/DELETE statements.

    I've found the opposite: particularly for MERGE statements that just do an INSERT/UPDATE, putting the logic in separate UPDATE and INSERT statements is much faster than a MERGE.

  • I agree - there is no real reason to avoid MERGE. And if it seems too difficult to understand, you may just need some practice 🙂

    I use it very often on very large tables in our DWH (~20 TB), and if I replaced it with multiple INSERT/UPDATE/DELETE statements, everything would run slower.

    I know there are several articles, many years old, that say MERGE is buggy, but if you check the bug lists, they cover only very special edge cases. One example: using MERGE to delete rows from tbl1, using the OUTPUT option *without* INTO, and putting the whole MERGE in a subselect that does SELECT ... INTO tbl1, which is something nobody would really consider doing:

    SELECT sub.*
    INTO tbl1
    FROM (MERGE INTO tbl1 AS t
          USING ... AS s
             ON s.key1 = t.key1
          WHEN ...
          WHEN NOT MATCHED
              THEN DELETE
          OUTPUT $action, Deleted.*
         ) AS sub
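
    For reference, the commonly recommended workaround for that edge case is to capture the OUTPUT rows with OUTPUT ... INTO a table variable rather than wrapping the MERGE in a subselect. A minimal sketch (the @changes table, the tbl2 source, and the column names are illustrative assumptions, not taken from the post above):

    ```sql
    -- Sketch: capture MERGE output via OUTPUT ... INTO instead of a subselect.
    -- @changes, tbl2, and the key column are illustrative assumptions.
    DECLARE @changes TABLE (change_action nvarchar(10), key1 int);

    MERGE INTO tbl1 AS t
    USING tbl2 AS s
       ON s.key1 = t.key1
    WHEN NOT MATCHED BY SOURCE
        THEN DELETE
    OUTPUT $action, deleted.key1
      INTO @changes (change_action, key1);

    SELECT * FROM @changes;   -- inspect what the MERGE deleted
    ```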

    God is real, unless declared integer.

  • Here is a test showing that separate UPDATE and INSERT statements are faster than a MERGE.

    SET STATISTICS IO, TIME OFF

    if object_id('dbo.t1','U') IS NOT NULL drop table dbo.t1;
    if object_id('dbo.t2','U') IS NOT NULL drop table dbo.t2;
    go
    create table dbo.t1(id int primary key clustered not null , value varchar(20)) ;
    create table dbo.t2(id int primary key clustered not null , value varchar(20)) ;

    insert into dbo.t1
    select t.N, SUBSTRING(CONVERT(varchar(40), newid()), 1, 10)
    from dbo.fnTally(1,10000000) t;

    insert into dbo.t2
    select t.N, SUBSTRING(CONVERT(varchar(40), newid()), 1, 20)
    from dbo.fnTally(1,11000000) t;

    SET STATISTICS IO, TIME ON
    GO

    BEGIN TRANSACTION
    GO
    PRINT '************************************* Start Merge'
    MERGE
    INTO dbo.t1 AS tgt
    USING dbo.t2 as src
    ON src.id = tgt.id
    WHEN MATCHED THEN
    UPDATE SET tgt.value = src.value
    WHEN NOT MATCHED THEN
    INSERT (id, value) VALUES (src.id, src.value)
    ;
    PRINT '************************************* End Merge'
    GO
    ROLLBACK
    GO
    BEGIN TRANSACTION
    GO
    PRINT '************************************* Start Upsert'
    UPDATE x
    SET x.value = y.value
    FROM dbo.t1 x
    INNER JOIN dbo.t2 y
    ON y.id = x.id
    ;

    INSERT INTO dbo.t1 (id, value)
    SELECT x.id, x.value
    FROM dbo.t2 x
    WHERE NOT EXISTS (SELECT *
    FROM dbo.t1 y
    WHERE y.id = x.id)
    ;
    PRINT '************************************* End Upsert'
    GO
    ROLLBACK;

    ************************************* Start Merge

    SQL Server Execution Times:
    CPU time = 0 ms, elapsed time = 0 ms.
    Table 't1'. Scan count 1, logical reads 30080903, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
    Table 't2'. Scan count 1, logical reads 30004, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.

    SQL Server Execution Times:
    CPU time = 24266 ms, elapsed time = 25164 ms.

    (11000000 rows affected)
    ************************************* End Merge
    ************************************* Start Upsert

    SQL Server Execution Times:
    CPU time = 0 ms, elapsed time = 0 ms.
    Table 't1'. Scan count 1, logical reads 36299, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
    Table 't2'. Scan count 1, logical reads 27286, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.

    SQL Server Execution Times:
    CPU time = 11015 ms, elapsed time = 11272 ms.

    (10000000 rows affected)
    Table 't1'. Scan count 1, logical reads 3223826, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
    Table 't2'. Scan count 1, logical reads 30004, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
    Table 'Worktable'. Scan count 1, logical reads 2883241, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.

    SQL Server Execution Times:
    CPU time = 5469 ms, elapsed time = 5500 ms.

    (1000000 rows affected)
    ************************************* End Upsert

    The MERGE took about 25 seconds compared to about 17 seconds (11.3 s + 5.5 s) for the UPSERT.
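
    One caveat with the two-statement pattern, for what it's worth: unlike a single MERGE, the UPDATE and the INSERT are separate operations, so under concurrent writers they are usually wrapped in a transaction with locking hints on the target. A minimal sketch, reusing the dbo.t1/dbo.t2 tables from the script above (the hints shown are the commonly recommended ones, not part of the original test):

    ```sql
    -- Sketch: two-statement upsert made safe for concurrent writers.
    -- Reuses dbo.t1/dbo.t2 from the test script above.
    BEGIN TRANSACTION;

    UPDATE x
    SET x.value = y.value
    FROM dbo.t1 x WITH (UPDLOCK, SERIALIZABLE)  -- hold range locks on the target
    INNER JOIN dbo.t2 y ON y.id = x.id;

    INSERT INTO dbo.t1 (id, value)
    SELECT x.id, x.value
    FROM dbo.t2 x
    WHERE NOT EXISTS (SELECT *
                      FROM dbo.t1 y WITH (UPDLOCK, SERIALIZABLE)
                      WHERE y.id = x.id);

    COMMIT TRANSACTION;
    ```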

    dbo.fnTally

  • Thank you everybody for your helpful comments.

    A little context here.

    As a database consultant, I was asked by my client to perform a merge on 27 tables, one of which had 700 columns. Having never used MERGE before, I assumed (correctly or not) that snippets of code would be required for all of those columns. To avoid this, I wrote a simple merge that knew nothing about the columns, and it worked fine. Later, when it became apparent that MERGE in an SSIS package was required (corporate standard), I wrote a proc to auto-generate all the merges, and semi-automated the package as well. I have no issue with MERGE, even though I've never written one, so the article's title was a bit inflammatory. But it was a lot of fun writing my own merge.
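
    The auto-generation approach described above can be sketched with the catalog views: read the column list from sys.columns, build the MERGE text, and execute it dynamically. A minimal sketch, not the author's actual proc, assuming dbo.t1/dbo.t2 tables with a single key column named id purely for illustration, and STRING_AGG (SQL Server 2017+):

    ```sql
    -- Sketch: generate a column-agnostic MERGE from sys.columns metadata.
    -- Table names and the key column (id) are illustrative assumptions.
    DECLARE @cols nvarchar(max), @set nvarchar(max), @sql nvarchar(max);

    SELECT @cols = STRING_AGG(QUOTENAME(name), N', '),
           @set  = STRING_AGG(N'tgt.' + QUOTENAME(name)
                              + N' = src.' + QUOTENAME(name), N', ')
    FROM sys.columns
    WHERE object_id = OBJECT_ID(N'dbo.t2')
      AND name <> N'id';                 -- non-key columns only

    SET @sql = N'MERGE INTO dbo.t1 AS tgt
    USING dbo.t2 AS src ON src.id = tgt.id
    WHEN MATCHED THEN UPDATE SET ' + @set + N'
    WHEN NOT MATCHED THEN INSERT (id, ' + @cols + N')
    VALUES (src.id, src.' + REPLACE(@cols, N', ', N', src.') + N');';

    EXEC sys.sp_executesql @sql;
    ```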

  • Here is a test that shows a MERGE is faster than separate UPDATE/INSERT statements.

    If you change your data to a more realistic scenario, you will come to the opposite conclusion. It is not - for most people - realistic to update all existing 10,000,000 rows. With 1,000,000 updates and 1,000,000 inserts, MERGE 'wins'. Conclusions can only be made if the number of rows is known and is realistic.

    ************************************* Start Merge

    SQL Server Execution Times:

    CPU time = 18250 ms, elapsed time = 20696 ms.

    (2000001 rows affected)

    ************************************* End Merge

    ************************************* Start Upsert

    SQL Server Execution Times:

    CPU time = 14579 ms, elapsed time = 15593 ms.

    (1000001 rows affected)

    SQL Server Execution Times:

    CPU time = 12112 ms, elapsed time = 11685 ms.

    (1000000 rows affected)

    ************************************* End Upsert

  • carsten.saastamoinen wrote:

    Here is a test that shows a MERGE is faster than separate UPDATE/INSERT statements.

    If you change your data to a more realistic scenario, you will come to the opposite conclusion. It is not - for most people - realistic to update all existing 10,000,000 rows. With 1,000,000 updates and 1,000,000 inserts, MERGE 'wins'. Conclusions can only be made if the number of rows is known and is realistic.

    ************************************* Start Merge

    SQL Server Execution Times:
    CPU time = 18250 ms, elapsed time = 20696 ms.

    (2000001 rows affected)
    ************************************* End Merge
    ************************************* Start Upsert

    SQL Server Execution Times:
    CPU time = 14579 ms, elapsed time = 15593 ms.

    (1000001 rows affected)

    SQL Server Execution Times:
    CPU time = 12112 ms, elapsed time = 11685 ms.

    (1000000 rows affected)
    ************************************* End Upsert

    Where is the test??

  • I use your script and just show the result.

  • carsten.saastamoinen wrote:

    I use your script and just show the result.

    Ok data setup with the following rows:

    insert into dbo.t1
    select t.N, SUBSTRING(CONVERT(varchar(40), newid()), 1, 10)
    from dbo.fnTally(1,10000000) t;

    insert into dbo.t2
    select t.N, SUBSTRING(CONVERT(varchar(40), newid()), 1, 20)
    from dbo.fnTally(1,1000000) t;

    insert into dbo.t2
    select t.N+10000000, SUBSTRING(CONVERT(varchar(40), newid()), 1, 20)
    from dbo.fnTally(1,1000000) t;

    Results MERGE:

    ************************************* Start Merge

    Table 't1'. Scan count 1, logical reads 3090684, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
    Table 't2'. Scan count 1, logical reads 9210, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.

    SQL Server Execution Times:
    CPU time = 4750 ms, elapsed time = 4769 ms.

    (2000000 rows affected)
    ************************************* End Merge

    Results UPSERT:

    ************************************* Start Upsert

    Table 't1'. Scan count 1, logical reads 41520, physical reads 0, page server reads 0, read-ahead reads 7, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
    Table 't2'. Scan count 1, logical reads 4623, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.

    SQL Server Execution Times:
    CPU time = 2047 ms, elapsed time = 2069 ms.

    (1000000 rows affected)
    Table 't1'. Scan count 5, logical reads 3235233, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
    Table 't2'. Scan count 5, logical reads 9340, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.

    SQL Server Execution Times:
    CPU time = 3095 ms, elapsed time = 2033 ms.

    (1000000 rows affected)
    ************************************* End Upsert

    So MERGE 4769 ms

    UPDATE + INSERT 2069 ms + 2033 ms = 4102 ms

    So the UPDATE + INSERT is still faster.

  • Try to look at the CPU time

  • carsten.saastamoinen wrote:

    Try to look at the CPU time 

    Try looking up the word faster.

  • YES! And the elapsed time will change more than CPU time on a production server with many users.

  • carsten.saastamoinen wrote:

    YES! And the elapsed time will change more than CPU time on a production server with many users.

    I hope I don't sound too critical, but that's not really true. Production database servers are generally I/O bound: their performance is usually limited by how fast data can be read from or written to storage. This is especially true for servers that handle large amounts of data or run complex queries, which can touch many disk blocks and perform many seeks. CPU utilization is typically not the bottleneck, because modern CPUs are fast enough for most database operations; instead, performance is limited by disk access, which is far slower than the CPU.

    The Update/Insert is still faster on my machine than the Merge, so I was wondering what data did you use to set up your test?
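
    Whether a particular server really is I/O bound can be checked from its accumulated wait statistics rather than assumed. A minimal sketch against the sys.dm_os_wait_stats DMV (the wait types filtered here are the usual I/O indicators; interpret the numbers relative to total waits):

    ```sql
    -- Sketch: rough check of how much of the server's waiting is I/O.
    -- sys.dm_os_wait_stats accumulates since the last restart or clear.
    SELECT wait_type,
           wait_time_ms,
           waiting_tasks_count
    FROM sys.dm_os_wait_stats
    WHERE wait_type LIKE N'PAGEIOLATCH%'              -- data-file read waits
       OR wait_type IN (N'WRITELOG', N'IO_COMPLETION')
    ORDER BY wait_time_ms DESC;
    ```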
