RE: Comparing column data in 2 tables with the same schema containing 1.5 billion records

  • JayK (4/14/2013)


    I have had the task of recreating a table in our data warehouse, which contains 1.5 billion rows, to make use of partitioning on the clustered index by date. To keep the tables in sync over the past few weeks I have had to run import jobs on both tables.

    Before moving across to the new partitioned table (by renaming it to bring it into production) I need to ensure that the EffectiveEndDate column in both tables is identical for every row (which has a PK).

    So my question is: what is the best way to compare 2 tables with 1.5 billion records to ensure the data in both is the same?

    At this stage 3rd party tools are not an option and I am using SQL Server 2012 SP1 Enterprise Edition.

    Any help greatly appreciated!!

    Start simple. Create a unique index on each table with columns ID and EffectiveEndDate.
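
    As a rough sketch of that step, the indexes could be created along these lines. The index names are hypothetical, the table names are taken from the query in this reply, and the WITH options are optional extras rather than part of the original suggestion. One caveat to check first: on a partitioned table, a unique index that is aligned with the partition scheme must include the partitioning column in its key, so if that column is neither ID nor EffectiveEndDate the index on the new table would have to be created non-aligned (for example with an explicit filegroup in an ON clause).

```sql
-- Index names are hypothetical; adjust table names to your own.
-- ONLINE = ON (Enterprise Edition) keeps the table available during the build;
-- SORT_IN_TEMPDB = ON moves the sort work to tempdb. Both are optional.
CREATE UNIQUE NONCLUSTERED INDEX IX_Table1_ID_EffectiveEndDate
    ON Table1 (ID, EffectiveEndDate)
    WITH (ONLINE = ON, SORT_IN_TEMPDB = ON);

CREATE UNIQUE NONCLUSTERED INDEX IX_Table2_ID_EffectiveEndDate
    ON Table2 (ID, EffectiveEndDate)
    WITH (ONLINE = ON, SORT_IN_TEMPDB = ON);
```

    With the same key order on both sides, the optimizer has the option of reading both indexes in ID order rather than sorting 1.5 billion rows, though the plan it actually picks should be checked.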

    Then run:

    SELECT
        t2.ID, t1.ID,
        t2.EffectiveEndDate, t1.EffectiveEndDate
    FROM Table1 t1
    INNER JOIN Table2 t2
        ON t2.ID = t1.ID
        AND t2.EffectiveEndDate <> t1.EffectiveEndDate
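
    The join above reports only rows that exist in both tables and whose dates are both non-NULL and different. If EffectiveEndDate can be NULL, or if a row could be missing from one table (both are assumptions, the original post does not say), a complementary check is to run EXCEPT in both directions. This is a different technique from the join, sketched here with the same table and column names:

```sql
-- EXCEPT treats two NULLs as equal, so a NULL date on one side and a
-- real date on the other is reported, while NULL vs. NULL is not.

-- Rows in Table1 with no identical (ID, EffectiveEndDate) row in Table2
SELECT ID, EffectiveEndDate FROM Table1
EXCEPT
SELECT ID, EffectiveEndDate FROM Table2;

-- Rows in Table2 with no identical (ID, EffectiveEndDate) row in Table1
SELECT ID, EffectiveEndDate FROM Table2
EXCEPT
SELECT ID, EffectiveEndDate FROM Table1;
```

    If both statements return no rows, every ID is present in both tables with the same EffectiveEndDate. Each statement scans both tables, so on 1.5 billion rows it is worth looking at the estimated plan before running them.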

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw
