Comparing data from two database for a list of selected tables more than 400 out of 1000+ tables

  • Comparing db schema is easy using VS 2013 and up versions and we can compare two databases without any problem.

    For data compare: VS provided us to select list of tables. In my case I have some lookup tables which are around 400 and selecting 400 tables from given database list of tables for compare is time consuming.

    Is there any way or tool with which I can provide my list to be compared by that tool?

    Selecting 400 tables from a given list of tables takes a lot of time and repeating this manually is very hard, takes alot of time for each compare. Please let me know if we can use VS2013 api or something by which we can compare data easily or any such tool is available.

    Shamshad Ali

  • I would recommend Redgates SQL Compare tools. Very easy and quick to use. You can download a 14 day trial.

  • The data comparison feature you're using in Visual Studio; is that the same thing as the SQL Server Data Tools plugin? I've found an MSDN article which seems to suggest that you can select multiple tables for which to perform a data comparison. If you're not using latest version of SSDT, then download it or read documentation on data compare feature.

    Compare and Synchronize Data in One or More Tables with Data in a Reference Database

    https://msdn.microsoft.com/en-us/library/dn266029(v=vs.103).aspx

    .. You can compare the data in a source database and a target database and specify which tables should be compared. You can review the data and decide which changes to synchronize. You can then either update the target to synchronize the databases or export the update script to the Transact-SQL editor or to a file. ..

    Download Latest SQL Server Data Tools

    https://msdn.microsoft.com/en-us/library/mt204009.aspx

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Eric M Russell (12/17/2015)


    The data comparison feature you're using in Visual Studio; is that the same thing as the SQL Server Data Tools plugin? I've found an MSDN article which seems to suggest that you can select multiple tables for which to perform a data comparison. If you're not using latest version of SSDT, then download it or read documentation on data compare feature.

    Compare and Synchronize Data in One or More Tables with Data in a Reference Database

    https://msdn.microsoft.com/en-us/library/dn266029(v=vs.103).aspx

    .. You can compare the data in a source database and a target database and specify which tables should be compared. You can review the data and decide which changes to synchronize. You can then either update the target to synchronize the databases or export the update script to the Transact-SQL editor or to a file. ..

    Download Latest SQL Server Data Tools

    https://msdn.microsoft.com/en-us/library/mt204009.aspx

    selecting 400 tables is tedious, does that tool accept a list of tables?

    edit: red gate sql compare looks like it has some options to create a list of tables to process, but maybe theres a command line version of Microsoft's that you can iterate through a list of tables? I just couldn't see it in the linked page.

  • A quick Google search for (+"SQL SERVER" +SSDT +PowerShell + Automation) did not turn up anything useful except for answered inquiries in Microsoft forums stating that it's not supported at this time. However, the inquiries were from back in 2012, so it may be worth digging into further to see if it's supported in more recent releases. At least it doesn't appear to be officially documented.

    One thing can do is, for each of the 400 tables, query an aggregate checksum across all rows and all columns, or all rows and a list of specific columns should be coded identically between the two databases. Run this for both databases, and then check for differences between the two results using text comparison tool like WinMerge.

    So, how do you easily run an aggregate checksum across 400 tables? You do it by using an INFORMATION_SCHEMA query to build up each individual table query. What each query will do is compute a single 4 byte checksum value for each table.

    SELECT 'SELECT '''+TABLE_SCHEMA+'.'+TABLE_NAME

    +''' AS TABLE_NAME, CHECKSUM_AGG(CHECKSUM(*)) AS CHKSUM FROM '

    +TABLE_SCHEMA+'.'+TABLE_NAME+';'

    FROM INFORMATION_SCHEMA.TABLES

    ORDER BY TABLE_SCHEMA+'.'+TABLE_NAME;

    Here is an example of what it will produce:

    SELECT 'dbo.DimAccount' AS TABLE_NAME

    , CHECKSUM_AGG(CHECKSUM(*)) AS CHKSUM FROM dbo.DimAccount;

    SELECT 'dbo.DimChannel' AS TABLE_NAME

    , CHECKSUM_AGG(CHECKSUM(*)) AS CHKSUM FROM dbo.DimChannel;

    SELECT 'dbo.DimCurrency' AS TABLE_NAME

    , CHECKSUM_AGG(CHECKSUM(*)) AS CHKSUM FROM dbo.DimCurrency;

    ...

    Yes, I know what some of you are thinking, but this technique is surprisingly fast, requiring a single full scan for each table. Any data comparison tool which require that at a minimum. This technique described above is actually faster than the column by column comparison used by tools like SSDT and Red Gate. Of course, it doesn't tell you which specific rows and columns differ, but it will tell you which specific tables are different (hopefully only a much smaller subset of tables), so you can then drill down further using a row/column comparison tool like SSDT.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Tried given solution and put some efforts, it seems typical and can't help me. Still looking forward on my problem I requested here.

  • shamshad.ali (12/22/2015)


    Tried given solution and put some efforts, it seems typical and can't help me. Still looking forward on my problem I requested here.

    Just out of curiousity, how far off was Eric's solution? Did he come close to what you needed? If you could identify where his solution failed to apply, maybe we could suggest modifications to it?

  • On the first step the checksum results were different, even there is no change in tables (100%) similar tables have given me different checksum values, that would be a base to identify differences and pick those objects for further comparison. Made me and left it because I had a list of 1000+ tables. Can't concentrate due to high volume of other activities, that's why i asked here if i can resolve it quickly with the help of experts from here.

    Shamshad Ali

  • So what products have you tried? I haven't used Red Gate's tool, but have heard good things about it.

    Please know that, given the volume of tables you're talking about, the process may very well take a long time to run. You might be better off doing this on a copy of production or during downtime.

  • Around 30 tables do not show checksum value for given tables. Kindly let me know what would be the cause that checksum is null for those tables?

    Shamshad

  • There is a tool available that is much cheaper than RedGate and will perform the row by row comparisons of the data that you require. It allows you to filter the list of tables to be compared on a range of criteria.

    You can download a trial version here[/url]

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply