How to Compare the Data in Two Tables Without Any 3rd-Party Tool?

  • Sarab_SQLChamp

    Hall of Fame

    Points: 3882

    Comments posted to this topic are about the item How to Compare the Data in Two Tables Without Any 3rd-Party Tool?

    Regards,
    Sarabpreet Singh 😎
    Sarabpreet.com
    SQLChamp.com
    Twitter: @Sarab_SQLGeek

  • hr_sn

    Ten Centuries

    Points: 1062

    When I found out about this tool a while ago, I had the same reaction: WOW. We ended up using it in one of our DW implementations for change detection. It works fine, but I wouldn't use it in future projects. I think the EXCEPT operator can be much better and gives you more control over what can be done. In my experience the tool has some limitations, listed below (though that could just be my limited knowledge of it):

    1. It requires a primary key definition (as stated in the article), so it is no good with heap tables.

    2. If no differences are detected, the diff table (assuming you are sending differences to a table) is not emptied. It still shows the differences from the last comparison. This behavior caught me out for some time.

    3. The diff table inherits the schema of the user executing the command. This can be an issue where a generic account is used; we had to run extra T-SQL to change the schema. It would be easier if we could assign a default schema to AD groups, which is kind of a bug in MSSQL at the moment.

    I think this tool can be good for ad hoc requirements, but so is EXCEPT, where I don't have to remember the syntax. :hehe:

    Regards,

    - Harish

  • Hardy21

    SSCrazy Eights

    Points: 9708

    hr_sn (8/5/2011)


    Your point is very valid - EXCEPT is a better option than tablediff.

    Thanks

  • kramaswamy

    SSCoach

    Points: 18135

    Yeah, you could write a custom script to do this, by taking EXCEPT in both directions and a UNION of the results to see the rows that differ (INTERSECT would show the ones which are the same). It is kind of nice to have one already written, though.
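
    A minimal sketch of that custom-script idea, run against SQLite through Python's sqlite3 module rather than SQL Server, purely so it is easy to try. The table names src and tgt, the helper names, and the sample rows are all made up for illustration. The set operators should behave similarly in T-SQL, but treat this as an illustration, not a tablediff replacement.

```python
import sqlite3


def diff_rows(conn, source, target):
    """Rows that differ: (source EXCEPT target) UNION ALL (target EXCEPT source).

    Each row is tagged with the side it came from. Table names are
    interpolated directly, so only pass trusted identifiers.
    """
    sql = f"""
        SELECT 'source_only' AS side, d.*
        FROM (SELECT * FROM {source} EXCEPT SELECT * FROM {target}) AS d
        UNION ALL
        SELECT 'target_only' AS side, d.*
        FROM (SELECT * FROM {target} EXCEPT SELECT * FROM {source}) AS d
    """
    return sorted(conn.execute(sql).fetchall())


def same_rows(conn, source, target):
    """Rows present in both tables, via INTERSECT."""
    sql = f"SELECT * FROM {source} INTERSECT SELECT * FROM {target}"
    return sorted(conn.execute(sql).fetchall())


def make_demo_db():
    """Build two small tables with no primary key (hypothetical sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE src (id INTEGER, name TEXT);
        CREATE TABLE tgt (id INTEGER, name TEXT);
        INSERT INTO src VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
        INSERT INTO tgt VALUES (1, 'Alice'), (2, 'Bobby'), (4, 'Dave');
        """
    )
    return conn


conn = make_demo_db()
for row in diff_rows(conn, "src", "tgt"):
    print(row)
print("same:", same_rows(conn, "src", "tgt"))
```

    Note that the set operators work on whole rows and collapse duplicates, so no primary key is needed, but a changed row shows up twice: once from each side.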

    Has anyone done a performance analysis to see how well this does against large tables? The fact that it took ~7 seconds to run against a table which had only four records is kind of disconcerting. What if the tables had millions of records?

  • john.arnott

    SSChampion

    Points: 11882

    Sarabpreet,

    Thank you for this straightforward description and instructions on Tablediff.

    Yes, the EXCEPT operator is easy to use to find rows that differ between tables, but Tablediff can also show you HOW they differ, a separate question which would need a lot more coding with a T-SQL script. As with almost everything in IT, there are multiple ways to approach a change analysis question, and it's good to understand the strengths of the various tools available.
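
    To give a rough idea of the extra coding the "HOW they differ" question takes, here is a small sketch, again in Python against SQLite rather than T-SQL. It assumes both tables share a unique key column; the names src, tgt, id, name and city and the helper column_changes are hypothetical. It only reports rows present on both sides; rows missing from one side would still need EXCEPT or an outer join.

```python
import sqlite3


def column_changes(conn, source, target, key, columns):
    """Return (key, column, source_value, target_value) for each changed column.

    Joins the two tables on `key` and checks one column at a time.
    `IS NOT` is SQLite's NULL-safe inequality; other engines spell it differently.
    Identifiers are interpolated directly, so only pass trusted names.
    """
    changes = []
    for col in columns:
        sql = f"""
            SELECT s.{key}, s.{col}, t.{col}
            FROM {source} AS s
            JOIN {target} AS t ON s.{key} = t.{key}
            WHERE s.{col} IS NOT t.{col}
        """
        for k, old, new in conn.execute(sql):
            changes.append((k, col, old, new))
    # Sort on (key, column) only, so NULL values are never compared.
    return sorted(changes, key=lambda c: (c[0], c[1]))


def make_demo_db():
    """Two small keyed tables with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE src (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
        CREATE TABLE tgt (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
        INSERT INTO src VALUES (1, 'Alice', 'NY'), (2, 'Bob', 'LA'), (3, 'Carol', NULL);
        INSERT INTO tgt VALUES (1, 'Alice', 'NY'), (2, 'Bobby', 'LA'),
                               (3, 'Carol', 'SF'), (4, 'Dave', 'TX');
        """
    )
    return conn


conn = make_demo_db()
for change in column_changes(conn, "src", "tgt", "id", ["name", "city"]):
    print(change)
```

    One query per column keeps the sketch short; on wide or large tables a single pass over the join would likely be a better design.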

  • AmolNaik

    SSCarpal Tunnel

    Points: 4767

    Thanks for posting this article, but I think the EXCEPT operator will do the trick. It is powerful and neat.

    Amol Naik

  • alen teplitsky

    SSC-Dedicated

    Points: 30014

    Can you use this to monitor replication, say, to compare the publisher and subscriber tables? How long would it take to get a result on tables with tens of millions of rows on modern hardware like ProLiant G5s and G7s?

  • Sarab_SQLChamp

    Hall of Fame

    Points: 3882

    Obviously this can be used in replication scenarios. I can't really comment on the performance part, as it will depend on many things, such as CPU and memory usage, the workload running during the comparison, the number of rows, etc. I never really got an opportunity to test it in such an environment.

    Once you have tested it, do share your stats.

    Regards,
    Sarabpreet Singh 😎
    Sarabpreet.com
    SQLChamp.com
    Twitter: @Sarab_SQLGeek

  • hr_sn

    Ten Centuries

    Points: 1062

    Hi Alen,

    This tool is essentially meant for comparing publisher and subscriber tables (http://msdn.microsoft.com/en-us/library/ms162843.aspx). I've used it to compare tables with about 25 million records on a pretty standard server (quad core + 4 GB RAM, on a VM), and it took about 15 minutes. I ran this test on the same VM and the same instance, but across different DBs.

    HTH!

    Cheers,

    - Harish
