Testing Two Tables Data for Equality

  • Imagine we have two very simple tables, I show them in the set format:

    T1 = {1,2,3}

    T2 = {1,2,4}

    T1 ≠ T2

    The goal is checking the two tables for equality and I know a lot of technique for solving it. but I am looking for a global method.

    I have an idea but not sure it is true in all cases.

    The idea is:

    SELECT CASE WHEN (SELECT SUM(position * value)

    FROM (

    SELECT value, ROW_NUMBER() OVER(ORDER BY value)

    FROM T1

    ) AS D(value, position)

    ) =

    (SELECT SUM(position * value)

    FROM (

    SELECT value, ROW_NUMBER() OVER(ORDER BY value)

    FROM T1

    ) AS D(value, position)

    THEN 'Equal'

    ELSE 'Not Equal'

    END;

    What's your opinion about this method, is it responding every time for every instances?

  • You can check with BINARY_CHECKSUM()

    BINARY_CHECKSUM can be used to detect changes to a row of a table.

    declare @Temp table(Col0 int ,Col1 int ,Col2 int )

    declare @Temp1 table(Col0 int ,Col1 int ,Col2 int )

    insert @Temp select 1,2,3

    insert @Temp1 select 1,2,4

    Select BINARY_CHECKSUM(*) from @Temp

    Select BINARY_CHECKSUM(*) from @Temp1

    Here you can see the difference @Temp will return 291 and @Temp1 will return 292 so it is "NOT EQUAL"

    Thanks
    Parthi

  • Parthi,

    I think he was showing each table having three rows, not three columns.

    As far as a way to compare the complete contents of two identically structured tables you could try something like this:

    CREATE TABLE #T1 (ID INT);

    CREATE TABLE #T2 (ID INT);

    INSERT INTO #T1 VALUES (1), (2), (3);

    INSERT INTO #T2 VALUES (1), (2), (4);

    IF (SELECT COUNT(*) FROM #T1) = (SELECT COUNT(*) FROM #T2)

    AND NOT EXISTS(SELECT * FROM #T1 EXCEPT SELECT * FROM #T2)

    AND NOT EXISTS(SELECT * FROM #T2 EXCEPT SELECT * FROM #T1)

    PRINT 'Equal';

    ELSE

    PRINT 'Not Equal';

    However that wouldn't always identify differences if your table doesn't have a unique primary key.

  • An incredibly quick and dirty way:

    CREATE TABLE #T1 (ID INT);

    CREATE TABLE #T2 (ID INT);

    INSERT INTO #T1 VALUES (1)

    INSERT INTO #T1 VALUES (2)

    INSERT INTO #T1 VALUES (3)

    INSERT INTO #T2 VALUES (1)

    INSERT INTO #T2 VALUES (2)

    INSERT INTO #T2 VALUES (4)

    SELECT * from #t1

    EXCEPT

    SELECT * FROM #t2

    SELECT * from #t2

    EXCEPT

    SELECT * FROM #t1


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • Craig,

    That is a portion of what I suggested, but it really falls short.

    For example if the first table contains 7 rows with a value of 1 and the second table contains 1 row with a value of 1 then it would show the two tables match. (Which is why I included the row count, which gets you closer but not all the way.)

  • UMG Developer (4/1/2011)


    Craig,

    That is a portion of what I suggested, but it really falls short.

    For example if the first table contains 7 rows with a value of 1 and the second table contains 1 row with a value of 1 then it would show the two tables match. (Which is why I included the row count, which gets you closer but not all the way.)

    True enough. To truly test the two tables against each other though you'd need to have the business key to do compares with at the row level. My apologies, I hadn't noticed the EXCEPT component in your script. I was just going to fast when I was at work and buzzed past it, noticed the request, and just kept trucking with a quick way.

    I assumed he was going to have to expand on the idea eventually but something like that would at least let him start doing row research.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • Here is an option that works better, though I still won't guarantee 100% accuracy. (I haven't tested every possible combination, like the use of LOB data types.)

    CREATE TABLE #T1 (ID INT);

    CREATE TABLE #T2 (ID INT);

    INSERT INTO #T1 VALUES (1), (2), (2), (3), (3);

    INSERT INTO #T2 VALUES (1), (2), (3), (1), (1);

    IF NOT EXISTS(SELECT *, ROW_NUMBER() OVER (PARTITION BY ID ORDER BY ID) AS RN FROM #T1

    EXCEPT SELECT *, ROW_NUMBER() OVER (PARTITION BY ID ORDER BY ID) AS RN FROM #T2)

    AND NOT EXISTS(SELECT *, ROW_NUMBER() OVER (PARTITION BY ID ORDER BY ID) AS RN FROM #T2

    EXCEPT SELECT *, ROW_NUMBER() OVER (PARTITION BY ID ORDER BY ID) AS RN FROM #T1)

    PRINT 'Equal';

    ELSE

    PRINT 'Not Equal';

    For this to work you would have to list every column in the table in the PARTITION BY clause, but by doing that you will catch when there are a different number of rows in the two tables with the exact same values. (Which column(s) you put in the ORDER BY portion won't make any difference.)

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply