Technical Article

Compare data of two tables

,

I often come across the need to compare data from 2 tables in SQL Server and see if there is any mismatching data in tables. I know, there are loads of tools available in market for this but the good one I know are all paid tools. The price doesn't look worth it when I know I'll not need the tool throughout the year and i'll need to pay the renew fee again after year end. So I thought of writing a script to dynamically compare the data between any 2 given tables on a server. The script will return the all the mismatching columns along with data. If there are no mismatching columns, it will tell that all data is identical. You may also choose to return only needed number of records after comparision. e.g. if your tables have 2 million records and there are mismatches, then you just want to see first 100 records along with the mismatching columns, then you can specify the @TopNRecords parameter.

The good thing about script is that I have not used any loops in it and comparison is done based on joins. You can specify the joining keys, which normally should be your primary key. But you may choose to use different columns for comparison.

So enjoy the script and leave your comments if you find any bugs.

/****** Object: StoredProcedure [dbo].[usp_CompareTable] Script Date: 08-30-2019 00:59:18 ******/-- =======================================================================================================================================================
-- Purpose : This stored procedure will compare data of two tables.
-- No. of Parameter : 6
-- @db1 : First Database Name
-- @Tb1 : Table Name of First Database
-- @db2 : Second Database Name
-- @Tb2 : Table Name of Second Database
-- @Matching Key: Matching Keys
-- @TopN : No. of records you want to display after comparision
-- =======================================================================================================================================================
SET ANSI_NULLS ON;
GO

SET QUOTED_IDENTIFIER ON;
GO

SET ANSI_WARNINGS OFF;
GO

ALTER PROCEDURE dbo.usp_CompareTable
(   @Db1         VARCHAR(100),
    @Tb1         VARCHAR(100),
    @Db2         VARCHAR(100),
    @Tb2         VARCHAR(100),
    @MatchingKey VARCHAR(500),
    @TopN        INT = NULL)
AS
BEGIN
    DECLARE @JoinCondition VARCHAR(MAX) = '';
    DECLARE @i INT = 0;
    DECLARE @as VARCHAR(200);
    SELECT
          value,
          rnm = ROW_NUMBER() OVER (ORDER BY value)
     INTO #ColumnList
     FROM STRING_SPLIT(@MatchingKey, ',');
    SELECT @i = COUNT(1) FROM #ColumnList;
    WHILE @i > 0
    BEGIN

        --Prepare JOIN condition
        SELECT
               @JoinCondition
            = @JoinCondition + ' A' + '.' + value + '=' + ' B' + '.' + value
         FROM  #ColumnList
         WHERE rnm = @i;
        SET @i = @i - 1;
        IF @i <> 0
            SET @JoinCondition = @JoinCondition + ' AND ';
    END;

    --Prepare CASE condition
    DECLARE @CaseCondition VARCHAR(MAX);
    CREATE TABLE #cases
    (CaseFromSelect VARCHAR(MAX));
    INSERT INTO
        #cases
    EXEC ('SELECT ''CASE WHEN A.''+A.COLUMN_NAME+'' <> B.''+B.COLUMN_NAME+'' THEN ''''''+A.COLUMN_NAME+

''''''''+'' ELSE '''''''' END '' +A.COLUMN_NAME

FROM ' + @Db1 + '.INFORMATION_SCHEMA.COLUMNS A INNER JOIN ' + @Db2 + '.INFORMATION_SCHEMA.COLUMNS B

ON A.COLUMN_NAME=B.COLUMN_NAME WHERE A.Table_name =''' + @Tb1 + ''' and A.Table_catalog=''' + @Db1 + ''' AND B.table_name =''' + @Tb2 + ''' AND B.Table_catalog=''' + @Db2 + '''');
    SET @CaseCondition = '';
    SELECT @CaseCondition = @CaseCondition + CaseFromSelect + ', ' FROM #cases;
    SELECT @CaseCondition = SUBSTRING(@CaseCondition, 0, LEN(@CaseCondition));
    EXEC ('SELECT ' + @CaseCondition + ' INTO ##T123 FROM ' + @Db1 + '..' + @Tb1 + ' A ' + ' INNER JOIN ' + @Db2 + '..' + @Tb2 + ' B ' + ' ON ' + @JoinCondition);
    CREATE TABLE #Mismatchingcolumns
    (Columnlist VARCHAR(5000));
    DECLARE @s VARCHAR(MAX) = '';
    SELECT @s = @s + 'SELECT ' + COLUMN_NAME + ' FROM ##t123 UNION '
     FROM  tempdb.INFORMATION_SCHEMA.COLUMNS
     WHERE TABLE_NAME = '##T123';
    SELECT @s = SUBSTRING(@s, 1, LEN(@s) - 6);
    INSERT INTO #Mismatchingcolumns EXEC (@s);
    UPDATE #Mismatchingcolumns
     SET   Columnlist = ' A ' + '.' + Columnlist + ' , ' + ' B ' + '.' + Columnlist
     WHERE Columnlist <> '';
    DECLARE @Clist VARCHAR(MAX) = '';
    SELECT @Clist = @Clist + (Columnlist) + ', '
     FROM  #Mismatchingcolumns
     WHERE Columnlist <> '';
    SELECT @Clist = SUBSTRING(@Clist, 0, LEN(@Clist));
    IF
    (
        SELECT COUNT(*) FROM #Mismatchingcolumns WHERE Columnlist <> ''
    ) = 0
        SELECT 'All data identical';
    ELSE
    BEGIN
        IF @TopN IS NULL
            SET @TopN = 999999999;
        DECLARE @FinalQuery VARCHAR(MAX) = '';
        SET @FinalQuery
            = 'SELECT TOP ' + CAST(@TopN AS VARCHAR(10)) + ' ' + @Clist + ' FROM ' + @Db1
              + '..' + @Tb1 + ' A ' + ' INNER JOIN ' + @Db2 + '..' + @Tb2 + ' B '
              + ' ON ' + @JoinCondition;
        EXEC (@FinalQuery);
    END;
    DROP TABLE ##T123;
    DROP TABLE #Mismatchingcolumns;
END;

Rate

3.25 (4)

You rated this post out of 5. Change rating

Share

Share

Rate

3.25 (4)

You rated this post out of 5. Change rating