Using EXCEPT to Determine Row Differences

  • M. Douglas Bowan

    SSC-Addicted

    Points: 459

    Comments posted to this topic are about the item Using EXCEPT to Determine Row Differences

  • nick.latocha

    SSCrazy

    Points: 2051

    Something to note about EXCEPT is that it is a DISTINCT Query and does not compare duplicate rows. Any suggestions on how you would get around this?

  • M. Douglas Bowan

    SSC-Addicted

    Points: 459

    Where I have been using this is to keep read-only data synchronized between two areas by their primary key, so I do not currently have a scenario like this.

    If I was keeping a full history of everything synchronized to date that this might be similar to what you describe.

    Could you explain further?

  • Arno Ho

    SSCommitted

    Points: 1652

    hello,

    The join between #source and #Target in the SELECT .... EXCEPT is not essential

    You can obtain the same result with a best cost

    SELECT .[ID]

    , .[Item]

    , .[Price]

    , .[OrderDate]

    , .[Units]

    , .[ShipmentDate]

    FROM [Source]

    EXCEPT

    SELECT [T].[ID]

    , [T].[Item]

    , [T].[Price]

    , [T].[OrderDate]

    , [T].[Units]

    , [T].[ShipmentDate]

    FROM [Target] [T]

    ;

    Am I wrong?

    Arnaud

  • nick.latocha

    SSCrazy

    Points: 2051

    Lets say we have a reconciliation against a 3rd party vendor where we submit sales to them.

    There is a chance we can submit the same sale twice by accident. We would then only have 1 sale in our system but the vendor would see 2 sales. How do we use EXCEPT to reconcile these kinds of differences?

  • simon.c.jones

    SSC Journeyman

    Points: 91

    I agree with Old Hand. I tried the comparison both with INNER JOINs and without. I got the same result in both instances.

  • andrea.onah

    SSC Journeyman

    Points: 95

    I have created, inserted the values and run the query as posted in this article and it produces the same result as

    SELECT .[ID]

    , .[Item]

    , .[Price]

    , .[OrderDate]

    , .[Units]

    , .[ShipmentDate]

    FROM [#Source]

    EXCEPT

    SELECT [T].[ID]

    , [T].[Item]

    , [T].[Price]

    , [T].[OrderDate]

    , [T].[Units]

    , [T].[ShipmentDate]

    FROM [#Target] [T]

    Still therefore trying to understand the use of the join

  • David McKinney

    SSChampion

    Points: 10358

    the join is to restrict the comparison to those cases where you have the same key values present in both source and value - and are looking for differences (rather than present in one and not in another.)

    Agreed that with the test data provided, this is not very apparent. if you add a row to the source table with an ID which is not present in the Target table, then you'll see the effect of the join.

  • David McKinney

    SSChampion

    Points: 10358

    you can use UNION also to do something similar. In this case, it returns both rows when there's a difference.

    SELECT *

    FROM (SELECT *

    , count(1) OVER (PARTITION BY ID) AS versions

    FROM (SELECT *

    FROM #Source AS ss

    UNION

    SELECT *

    FROM #Target AS ss

    ) x

    ) y

    WHERE versions <> 1

  • M. Douglas Bowan

    SSC-Addicted

    Points: 459

    Without the join, the except statement would also return rows in the source that do not exist in the target as well as those that exist (by id) in both sets but are different.

    What I needed was to determine differences between rows that exist between both sets, but have differences.

    Having a series of tables with 100, 200+ columns to compare is asking for human error, especially when nulls need to be taken into account.

  • M. Douglas Bowan

    SSC-Addicted

    Points: 459

    thank you for your explanation.

    having rows in the source that do not exist in the target as you describe is precisely why the join exists and the test data does not reflect that.

  • Brymen

    SSC-Addicted

    Points: 464

    In other words, the article shows how to use [font="Courier New"]EXCEPT[/font] to do something really cool. Compare records with matching primary keys to see if they are identical. But the example doesn't demonstrate the cool part. OK, I get it.

  • SqlMel

    SSCrazy

    Points: 2891

    This is a much better approach than comparing column by column.

    ---------------
    Mel. 😎

  • x

    SSC-Insane

    Points: 23349

    nick.latocha (3/25/2014)


    Lets say we have a reconciliation against a 3rd party vendor where we submit sales to them.

    There is a chance we can submit the same sale twice by accident. We would then only have 1 sale in our system but the vendor would see 2 sales. How do we use EXCEPT to reconcile these kinds of differences?

    When I was doing something similar (but used column by column compares so not as nice as this method), and the tables I compared had a chance of duplication, I added a column that was a count of rows, and did a group by everything to achieve it. Then this extra 'count' column could also be compared.

    Could be interesting to hear of an easier method, but thats what I came up with!

  • Chris Umbaugh

    Ten Centuries

    Points: 1254

    Nice article Douglas. I though you were crazy until I tried your code compared to what I had in my head. Pretty cool! I am faced with nearly an identical challenge today. Think I'll give it a try.

    Chris Umbaugh
    Data Warehouse / Business Intelligence Consultant
    twitter @ToledoSQL

Viewing 15 posts - 1 through 15 (of 73 total)

You must be logged in to reply to this topic. Login to reply