Compare data between rows of same table

  • create table ##Temp1 (

    myid int identity (1, 1),

    mytab1col1 int,

    mytab1col2 varchar(10),

    mytab1col3 smalldatetime)

    insert into ##Temp1 (mytab1col1, mytab1col2, mytab1col3)

    select 1, 'name1', '2010-01-01'

    union all

    select 1, 'name2', '2010-01-01'

    union all

    select 1, 'name3', '2010-01-02'

    union all

    select 1, 'name4', '2010-01-01'

    create table ##Temp2 (

    myid int identity (1, 1),

    mytab1col1 int,

    mytab1col2 varchar(10),

    mytab1col3 smalldatetime)

    insert into ##Temp2 (mytab1col1, mytab1col2, mytab1col3)

    select 2, 'name1', '2010-01-01'

    union all

    select 2, 'name2', '2010-01-01'

    union all

    select 2, 'name3', '2010-01-02'

    union all

    select 2, 'name5', '2010-01-01'

    union all

    select 3, 'name1', '2010-01-01'

    union all

    select 3, 'name2', '2010-01-01'

    union all

    select 3, 'name3', '2010-01-02'

    union all

    select 4, 'name1', '2010-01-02'

    union all

    select 4, 'name2', '2010-01-01'

    union all

    select 4, 'name3', '2010-01-02'

    union all

    select 4, 'name4', '2010-01-01'

    --Result should be

    Select1, 2

    union all

    SelectNULL, 2

    union all

    Select1, 3

    union all

    Select1, 4

    drop table ##Temp1

    drop table ##Temp2

    go

    Dear friends,

    I have a requirement n project to compare data of a table for reporting purpose. The above sample code will give you an idea about what i need. My table contains an identity column and a foreign key. I will get a foreign key as a source id and this id may have multiple records in the table. I take this source records (in above case records in ##Temp1) in a temporary table and i have to compare this source records with other records in original table (in above case records in ##Temp2). In the sample I gave, mentioned the result I am expecting.

    The first record in result is generated because source table have 'name4' with date and which is not available in target table for id 2.

    The second record in result is generated because target table have 'name5' and which is not available in source table.

    The third record in result is generated because source table have 'name4' and which is not available in target table for id 3.

    The fourth record in result is generated because target table have 'name1' with date '2010-01-02' and which is mismatched with record in source table.

    Could you help me to create a script for generating this output? Please give me an idea. Right now I have is a very complex script contains left join which will not give the exact result.

  • Hi

    Simply use EXCEPT.

    The following

    select col1,col2,...,colN from table1

    EXCEPT

    select col1,col2,...,colN from table2

    will give you the different rows.

    Another option is to use some tool for data/base comparison

    Regards

    IgorMi

    Igor Micev,My blog: www.igormicev.com

  • Since I have different ID's and I need them to be displayed in result and I don't want to compare the ID's but all other columns, EXCEPT will not work.

  • Maybe you should do a little work around.

    What about this?

    select t1.mytab1col1,t2.mytab1col1

    from ##Temp1 t1 left join ##Temp2 t2 on t1.mytab1col2=t2.mytab1col2

    and t1.mytab1col3=t2.mytab1col3 and t1.mytab1col1 != t2.mytab1col1

    group by t1.mytab1col1,t2.mytab1col1

    order by t1.mytab1col1 asc

    Igor Micev,My blog: www.igormicev.com

  • That query is having a problem, lets say the records with id value 4 in column mytab1col1 doesn't have any mismatch. The result still displays id 4, which it should not because the requirement is to display only mismatched records.

  • iam777here4u (4/25/2013)


    That query is having a problem, lets say the records with id value 4 in column mytab1col1 doesn't have any mismatch. The result still displays id 4, which it should not because the requirement is to display only mismatched records.

    You've written

    --Result should be

    Select1, 2

    union all

    SelectNULL, 2

    union all

    Select1, 3

    union all

    Select1, 4

    Igor Micev,My blog: www.igormicev.com

  • That's true and it is because there was below mismatch with respect to Id 4.

    ##Temp1 contains

    select 1, 'name1', '2010-01-01'

    ##Temp2 contains

    select 4, 'name1', '2010-01-02' --> The eighth record in original script: date mismatch

    Lets say both have date '2010-01-01' which means no mismatch with ##Temp1 and ##Temp2 records with respect to ID 4 (only ID differ which we are not comparing). This time 4 should not be displayed

    I was working on this morning onwards and what I did is below:

    select *

    from ##Temp2 t2

    left join ##Temp1 t1

    on t1.mytab1col2=t2.mytab1col2

    and t1.mytab1col3=t2.mytab1col3

    where t1.mytab1col1 is null

    order by t1.mytab1col1

    Other than this script, I used another script which will compare the count(*) group by mytab1col1 to get the mismatch in counts. But this seems complex to me.

  • iam777here4u (4/25/2013)


    That query is having a problem, lets say the records with id value 4 in column mytab1col1 doesn't have any mismatch. The result still displays id 4, which it should not because the requirement is to display only mismatched records.

    Please review your expected output table carefully. There are no column names, which would be useful. In the explanation for the four rows shown you refer to column ID, but elsewhere in your post, the column ID is irrelevant to the exercise. A precise definition will reward you with a precise answer.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Sorry for the trouble. I hope below code will give better idea.

    create table ##Temp1 (

    mytab1col1 int,

    mytab1col2 varchar(10),

    mytab1col3 smalldatetime)

    insert into ##Temp1 (mytab1col1, mytab1col2, mytab1col3)

    select 1, 'name1', '2010-01-01'

    union all

    select 1, 'name2', '2010-01-01'

    union all

    select 1, 'name3', '2010-01-02'

    union all

    select 1, 'name4', '2010-01-01'

    create table ##Temp2 (

    mytab2col1 int,

    mytab2col2 varchar(10),

    mytab2col3 smalldatetime)

    insert into ##Temp2 (mytab2col1, mytab2col2, mytab2col3)

    select 2, 'name1', '2010-01-01'

    union all

    select 2, 'name2', '2010-01-01'

    union all

    select 2, 'name3', '2010-01-02'

    union all

    select 2, 'name5', '2010-01-01'

    union all

    select 3, 'name1', '2010-01-01'

    union all

    select 3, 'name2', '2010-01-01'

    union all

    select 3, 'name3', '2010-01-02'

    union all

    select 4, 'name1', '2010-01-02'

    union all

    select 4, 'name2', '2010-01-01'

    union all

    select 4, 'name3', '2010-01-02'

    union all

    select 4, 'name4', '2010-01-01'

    union all

    select 5, 'name1', '2010-01-01'

    union all

    select 5, 'name2', '2010-01-01'

    union all

    select 5, 'name3', '2010-01-02'

    union all

    select 5, 'name4', '2010-01-01'

    union all

    select 5, 'name5', '2010-01-01'

    union all

    select 6, 'name1', '2010-01-01'

    union all

    select 6, 'name2', '2010-01-01'

    union all

    select 6, 'name3', '2010-01-02'

    union all

    select 6, 'name4', '2010-01-01'

    -- Result should be

    mytab2col1 mytab1col1

    ----------- ----------

    2 1 --> Id 2 displayed because it is missing name4

    2 1 --> Id 2 displayed again because it is having extra record name5

    3 1 --> Id 3 displayed because it is missing name4

    4 1 --> Id 4 displayed because it is having mismatch in date (please note the first record of ID 4)

    5 1 --> Id 5 displayed because it is having extra record name5

    --> Id 6 not displayed as no mismatch is occured

    --Please note that i want to compare records with all other columns except ID. The ID column should be displayed as result

    drop table ##Temp1

    drop table ##Temp2

    go

  • iam777here4u (4/25/2013)


    Sorry for the trouble. I hope below code will give better idea....

    No worries. I like this better, can we use it? I think folks will find it easier to relate to and to discuss, and it removes those global temp tables too, which are rarely a good idea:

    create table #Source (

    col1 int,

    col2 varchar(10),

    col3 smalldatetime)

    insert into #Source (col1, col2, col3)

    select 1, 'name1', '2010-01-01' union all

    select 1, 'name2', '2010-01-01' union all

    select 1, 'name3', '2010-01-02' union all

    select 1, 'name4', '2010-01-01'

    create table #Target (

    col1 int,

    col2 varchar(10),

    col3 smalldatetime)

    insert into #Target (col1, col2, col3)

    select 2, 'name1', '2010-01-01' union all

    select 2, 'name2', '2010-01-01' union all

    select 2, 'name3', '2010-01-02' union all

    select 2, 'name5', '2010-01-01' union all

    select 3, 'name1', '2010-01-01' union all

    select 3, 'name2', '2010-01-01' union all

    select 3, 'name3', '2010-01-02' union all

    select 4, 'name1', '2010-01-02' union all

    select 4, 'name2', '2010-01-01' union all

    select 4, 'name3', '2010-01-02' union all

    select 4, 'name4', '2010-01-01' union all

    select 5, 'name1', '2010-01-01' union all

    select 5, 'name2', '2010-01-01' union all

    select 5, 'name3', '2010-01-02' union all

    select 5, 'name4', '2010-01-01' union all

    select 5, 'name5', '2010-01-01' union all

    select 6, 'name1', '2010-01-01' union all

    select 6, 'name2', '2010-01-01' union all

    select 6, 'name3', '2010-01-02' union all

    select 6, 'name4', '2010-01-01'

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • -- Since there are no matches on col1 between the two tables,

    -- surely the starting point of the results table should reflect this?

    t.col1 s.col1

    2,NULL

    3,NULL

    4,NULL

    5,NULL

    6,NULL

    NULL,1

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • I came up to the following, based on your logic

    select t2.mytab2col1,t1.mytab1col1

    from ##Temp2 t2

    join ##Temp1 t1 on

    ((t2.mytab2col2 != t1.mytab1col2) and (t2.mytab2col3 = t1.mytab1col3))

    or ((t2.mytab2col2 = t1.mytab1col2) and (t2.mytab2col3 != t1.mytab1col3))

    group by t2.mytab2col1,t1.mytab1col1

    union all

    select ##Temp2.mytab2col1,##Temp1.mytab1col1

    from ##Temp2,##Temp1

    where not exists (select 1 from ##Temp1 where ##Temp1.mytab1col2 = ##Temp2.mytab2col2)

    group by ##Temp2.mytab2col1,##Temp1.mytab1col1

    Regards

    IgorMi

    Igor Micev,My blog: www.igormicev.com

  • I came up to the following, based on your logic

    select t2.mytab2col1,t1.mytab1col1

    from ##Temp2 t2

    join ##Temp1 t1 on

    ((t2.mytab2col2 != t1.mytab1col2) and (t2.mytab2col3 = t1.mytab1col3))

    or ((t2.mytab2col2 = t1.mytab1col2) and (t2.mytab2col3 != t1.mytab1col3))

    group by t2.mytab2col1,t1.mytab1col1

    union all

    select ##Temp2.mytab2col1,##Temp1.mytab1col1

    from ##Temp2,##Temp1

    where not exists (select 1 from ##Temp1 where ##Temp1.mytab1col2 = ##Temp2.mytab2col2)

    group by ##Temp2.mytab2col1,##Temp1.mytab1col1

    Regards

    IgorMi

    Igor Micev,My blog: www.igormicev.com

  • One more way to do this..

    ; WITH cte_Temp1 AS

    (

    SELECTT1.*, T2.mytab1col1 AS mytab1col1_T2

    FROM##Temp1 AS T1

    CROSS JOIN(

    SELECTDISTINCT T2.mytab1col1

    FROM##Temp2 AS T2

    ) AS T2

    )

    SELECTT1.mytab1col1, T1.mytab1col1_T2

    FROMcte_Temp1 AS T1

    LEFT OUTER JOIN##Temp2 AS T2 ON T1.mytab1col1_T2 = T2.mytab1col1 AND T1.mytab1col2 = T2.mytab1col2 AND T1.mytab1col3 = T2.mytab1col3

    WHERET2.myid IS NULL

    UNION ALL

    SELECTT1.mytab1col1, COALESCE(T1.mytab1col1_T2,T2.mytab1col1)

    FROMcte_Temp1 AS T1

    RIGHT OUTER JOIN ##Temp2 AS T2 ON T1.mytab1col1_T2 = T2.mytab1col1 AND T1.mytab1col2 = T2.mytab1col2

    WHERET1.myid IS NULL


    Kingston Dhasian

    How to post data/code on a forum to get the best help - Jeff Moden
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

  • Thank you so much Kingston... that works. I will be implementing your logic in my original script. I hope using CTE for millions of data is okay. Let you know the result soon.

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply