Brain Fried on a JOIN Question

  • Hello

    I am trying to compare two tables, OldTable and NewTable that have a unique identifier of Pos_No and want to see if any of the other values have changes. The code is below. However, I am sometimes not getting the results I want. Specifically, I know it did not pick up a change in Act_end. The rest of the much larger series of statements works fine but just missing something that is probably obvious.

    Any help greatly appreciated.

    Thanks!

    SELECT [compareHR].[dbo].[TableOLD].[Disc], [compareHR].[dbo].[TableOLD].[Pos_no], [compareHR].[dbo].[TableOLD].[Percent], [compareHR].[dbo].[TableOLD].[Title], [compareHR].[dbo].[TableOLD].[Emp_no], [compareHR].[dbo].[TableOLD].[Lastname], [compareHR].[dbo].[TableOLD].[Firstname], [compareHR].[dbo].[TableOLD].[Pl_start], [compareHR].[dbo].[TableOLD].[Pl_end], [compareHR].[dbo].[TableOLD].[Act_start], [compareHR].[dbo].[TableOLD].[Act_End], [compareHR].[dbo].[TableOLD].[Loc], [compareHR].[dbo].[TableOLD].[Org], [compareHR].[dbo].[TableNEW].[Disc], [compareHR].[dbo].[TableNEW].[Pos_no], [compareHR].[dbo].[TableNEW].[Percent], [compareHR].[dbo].[TableNEW].[Title], [compareHR].[dbo].[TableNEW].[Emp_no], [compareHR].[dbo].[TableNEW].[Lastname], [compareHR].[dbo].[TableNEW].[Firstname], [compareHR].[dbo].[TableNEW].[Pl_start], [compareHR].[dbo].[TableNEW].[Pl_end], [compareHR].[dbo].[TableNEW].[Act_start], [compareHR].[dbo].[TableNEW].[Act_End], [compareHR].[dbo].[TableNEW].[Loc], [compareHR].[dbo].[TableNEW].[Org]

    FROM [compareHR].[dbo].[TableOLD]

    JOIN [compareHR].[dbo].[TableNEW] ON (TableNEW.Pos_No = TableOLD.Pos_No)

    WHERE [compareHR].[dbo].[TableNEW].[Pos_no] = [compareHR].[dbo].[TableOLD].[Pos_no]

    AND ([compareHR].[dbo].[TableNEW].[Percent] <> [compareHR].[dbo].[TableOLD].[Percent]

    OR [compareHR].[dbo].[TableNEW].[Title] <> [compareHR].[dbo].[TableOLD].[Title]

    OR [compareHR].[dbo].[TableNEW].[Emp_no] <> [compareHR].[dbo].[TableOLD].[Emp_no]

    OR [compareHR].[dbo].[TableNEW].[Lastname] <> [compareHR].[dbo].[TableOLD].[Lastname]

    OR [compareHR].[dbo].[TableNEW].[Firstname] <> [compareHR].[dbo].[TableOLD].[Firstname]

    OR [compareHR].[dbo].[TableNEW].[Comp] <> [compareHR].[dbo].[TableOLD].[Comp]

    OR [compareHR].[dbo].[TableNEW].[Pl_start] <> [compareHR].[dbo].[TableOLD].[Pl_start]

    OR [compareHR].[dbo].[TableNEW].[Pl_end] <> [compareHR].[dbo].[TableOLD].[Pl_end]

    OR [compareHR].[dbo].[TableNEW].[Act_start] <> [compareHR].[dbo].[TableOLD].[Act_start]

    OR [compareHR].[dbo].[TableNEW].[Act_End] <> [compareHR].[dbo].[TableOLD].[Act_End]

    OR [compareHR].[dbo].[TableNEW].[Loc] <> [compareHR].[dbo].[TableOLD].[Loc]

    OR [compareHR].[dbo].[TableNEW].[Org] <> [compareHR].[dbo].[TableOLD].[Org])

    GO

  • You will not get the results you want if the values are NULL.

    When I compare tables I use a CHECKSUM.

    SELECT

    *

    FROM

    (SELECT

    [compareHR].[dbo].[TableOLD].[Pos_no],

    CHECKSUM(*) AS XSum

    FROM

    [compareHR].[dbo].[TableOLD]) AS Old

    FULL JOIN

    (SELECT

    [compareHR].[dbo].[TableNEW].[Pos_no],

    CHECKSUM(*) AS XSum

    FROM

    [compareHR].[dbo].[TableNEW]) AS New

    WHERE

    Old.Pos_no IS NULL -- Row doesn't exist in old table

    OR New.Pos_no IS NULL -- Row doesn't exist in new table

    OR Old.XSum <> New.XSum; -- Row is different

    The SQL Guy @ blogspot[/url]

    @SeanPearceSQL

    About Me[/url]

  • Ah!! Right...and since we have so few nulls I didn't think of that...for now, I might just take the easy way out as I get the files in Excel and only a few fields can have null I will just replace with a "-" or something. Thanks for finding my oversight!

  • The issue could be related to the nullability of some of the columns. The are different ways to tackle this but using CHECKSUM is not one of them because the possibility for collision is high (it is not a good hashing function).

    One way to accomplish this is using the INTERSECT operator. This operator treats NULL marks as equal so we can use something like:

    DECLARE @T1 TABLE (

    c1 int NOT NULL PRIMARY KEY,

    c2 int NULL,

    c3 varchar(25) NULL

    );

    DECLARE @T2 TABLE (

    c1 int NOT NULL PRIMARY KEY,

    c2 int NULL,

    c3 varchar(25) NULL

    );

    INSERT INTO @T1 (c1, c2, c3)

    VALUES (1, NULL, 'A'), (2, 2, 'B'), (3, 3, 'C');

    INSERT INTO @T2 (c1, c2, c3)

    VALUES (1, 1, 'A'), (2, 2, 'D'), (3, 3, 'C');

    SELECT

    A.c1,

    A.c2,

    A.c3,

    B.c2 AS B_c2,

    B.c3 AS B_c3

    FROM

    @T1 AS A

    INNER JOIN

    @T2 AS B

    ON B.c1 = A.c1

    WHERE

    NOT EXISTS (

    SELECT A.c2, A.c3

    INTERSECT

    SELECT B.c2, B.c3

    );

    GO

    There is an interesting suggestion in Connect.Microsoft.com asking to implement the DISTINCT predicate.

    Add language and optimizer support for ISO <distinct predicate>

    Feel free to vote if you think this could make our life easier. The new query will look like:

    SELECT

    A.c1,

    A.c2,

    A.c3,

    B.c2 AS B_c2,

    B.c3 AS B_c3

    FROM

    @T1 AS A

    INNER JOIN

    @T2 AS B

    ON B.c1 = A.c1

    WHERE

    (A.c2 IS DISTINCT FROM B.c2)

    OR

    (A.c3 is DISTINCT FROM B.c3)

    The similar implementation as of today would be:

    SELECT

    A.c1,

    A.c2,

    A.c3,

    B.c2 AS B_c2,

    B.c3 AS B_c3

    FROM

    @T1 AS A

    INNER JOIN

    @T2 AS B

    ON B.c1 = A.c1

    WHERE

    (A.c2 <> B.c2 OR (A.c2 IS NULL AND B.c2 IS NOT NULL) OR (A.c2 IS NOT NULL AND B.c2 IS NULL))

    OR

    (A.c3 <> B.c3 OR (A.c3 IS NULL AND B.c3 IS NOT NULL) OR (A.c3 IS NOT NULL AND B.c3 IS NULL))

    GO

    Ouch, it is not nice dealing with nullable columns.

  • Yup, you nailed it hunchback .... the previous sheets I got did not have any nulls and I forgot that....just found out they only need this task to be done once a week for 2 sheets & I have to modify a few things in Excel to make it "SSIS friendly" so might just cheat with a dummy value that I know could never be used unless this becomes a bigger "project" and I have to do this 5 minute task often. Cheers all.

    -Dave

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply