Blog Post

MERGE with OUTPUT clause

,

A few days back I was presented with a theoretical challenge, and now I thought I would share my solution with you. The challange was to keep track of changes on a table that we had only a readonly access to, and keep a history of the changes. Furthermore the table didn’t have a column to track the last modified date, but a primary key was available though.
We didn’t need to track the changes realtime, but once or twice a day was preferable. I came up with a solution using the MERGE stament, and applying the OUTPUT clause. Let me show a simple demo setup:

CREATE TABLE ReadOnlyTable (
    Id INT PRIMARY KEY,
    Col1 INT,
    Col2 INT,
    Timestamp DATETIME
)
CREATE TABLE HistoryTable (
    Id INT PRIMARY KEY,
    Col1 INT,
    Col2 INT,
    Timestamp DATETIME
)
GO
INSERT INTO ReadOnlyTable (Id, Col1, Col2, Timestamp)
VALUES
    (1, 1, 1, GETDATE()),
    (2, 2, 2, GETDATE()),
    (3, 3, 3, GETDATE())
GO

 

The ReadOnlyTable represents the table from a third party system, that we only have read access to.
The HistoryTable is a table to hold a copy of the snapshot at the scheduled compare times (once or twice a day).

I came up with this MERGE statement, which compares the entire content of the tables for changes – so it is not lightweight in any way. But luckily that wasn’t an issue Smiley The statement looks like this:

MERGE INTO HistoryTable Dst
USING ReadOnlyTable Src ON Dst.id = Src.id
WHEN MATCHED AND (Dst.col1 <> Src.col1 OR Dst.col2 <> Src.col2 OR Dst.Timestamp <> Src.Timestamp) THEN
    UPDATE
    SET Dst.col1 = Src.col1, Dst.col2 = Src.col2, Dst.Timestamp = src.Timestamp
WHEN NOT MATCHED BY TARGET THEN
    INSERT (id, col1, col2, Timestamp)
    VALUES (Src.id, Src.col1, Src.col2, Src.Timestamp)
WHEN NOT MATCHED BY SOURCE THEN
    DELETE
OUTPUT
    GETDATE() AS ChangeDate,
    COALESCE(Inserted.Id, Deleted.Id) AS Id,
    CASE
        WHEN Deleted.Id IS NULL AND Inserted.Id IS NOT NULL THEN 'i'
        WHEN Deleted.Id IS NOT NULL AND Inserted.Id IS NOT NULL THEN 'u'
        WHEN Deleted.Id IS NOT NULL AND Inserted.Id IS NULL THEN 'd'
    END AS ChangeType,
    deleted.col1 AS col1_before,
    deleted.col2 AS col2_before,
    deleted.Timestamp AS Timestamp_before,
    inserted.col1 AS col1_after,
    inserted.col2 AS col2_after,
    inserted.Timestamp AS Timestamp_after;

The basic MERGE statement just joins the source and destination on the Id column. If the Id exists in the destination table (HistoryTable) and if at least one of the other columns have changed, then we update. Rows that do not exist will be inserted. Deleted rows will simply be removed from the HistoryTable.

Given the three rows in ReadOnlyTable and no rows in HistoryTable, the execution of the MERGE statement will return this:

image

Because the rows are all new, there are no content in the *_before columns. To illustrate changes performed in the third party database, I will now update a few rows, and run the MERGE statement again:

 

UPDATE ReadOnlyTable SET Col1 = 42
WHERE Id = 1
GO
DELETE FROM ReadOnlyTable
WHERE Id = 2
GO
INSERT INTO ReadOnlyTable (Id, Col1, Col2, Timestamp)
VALUES (4, 4, 4, GETDATE())
GO

 

The MERGE now returns this:

image

 

Now we have a way to asynchronously track changes performed in the readonly table. All we need now, is to create a table to store the output data, and add the INTO <table name>. The final table and MERGE statement looks like this:

 

CREATE TABLE ReadOnlyTableChanges
(
    ChangeId INT IDENTITY PRIMARY KEY,
    ChangeDate DATETIME,
    Id INT,
    ChangeType CHAR(1),
    Col1_before INT,
    Col2_before INT,
    Timestamp_before DATETIME,
    Col1_after INT,
    Col2_after INT,
    Timestamp_after DATETIME
)
GO
MERGE INTO HistoryTable Dst
USING ReadOnlyTable Src ON Dst.id = Src.id
WHEN MATCHED AND (Dst.col1 <> Src.col1 OR Dst.col2 <> Src.col2 OR Dst.Timestamp <> Src.Timestamp) THEN
    UPDATE
    SET Dst.col1 = Src.col1, Dst.col2 = Src.col2, Dst.Timestamp = src.Timestamp
WHEN NOT MATCHED BY TARGET THEN
    INSERT (id, col1, col2, Timestamp)
    VALUES (Src.id, Src.col1, Src.col2, Src.Timestamp)
WHEN NOT MATCHED BY SOURCE THEN
    DELETE
OUTPUT
    GETDATE() AS ChangeDate,
    COALESCE(Inserted.Id, Deleted.Id) AS Id,
    CASE
        WHEN Deleted.Id IS NULL AND Inserted.Id IS NOT NULL THEN 'i'
        WHEN Deleted.Id IS NOT NULL AND Inserted.Id IS NOT NULL THEN 'u'
        WHEN Deleted.Id IS NOT NULL AND Inserted.Id IS NULL THEN 'd'
    END AS ChangeType,
    deleted.col1 AS col1_before,
    deleted.col2 AS col2_before,
    deleted.Timestamp AS Timestamp_before,
    inserted.col1 AS col1_after,
    inserted.col2 AS col2_after,
    inserted.Timestamp AS Timestamp_after
INTO ReadOnlyTableChanges; --This is added

 

Simple as that Smiley

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating