Compare data between rows of same table

Question

Post reply

Compare data between rows of same table

iam777here4u

Ten Centuries

Points: 1057
More actions
April 24, 2013 at 10:38 pm

#292439

create table ##Temp1 (
myid int identity (1, 1),
mytab1col1 int,
mytab1col2 varchar(10),
mytab1col3 smalldatetime)
insert into ##Temp1 (mytab1col1, mytab1col2, mytab1col3)
select 1, 'name1', '2010-01-01'
union all
select 1, 'name2', '2010-01-01'
union all
select 1, 'name3', '2010-01-02'
union all
select 1, 'name4', '2010-01-01'
create table ##Temp2 (
myid int identity (1, 1),
mytab1col1 int,
mytab1col2 varchar(10),
mytab1col3 smalldatetime)
insert into ##Temp2 (mytab1col1, mytab1col2, mytab1col3)
select 2, 'name1', '2010-01-01'
union all
select 2, 'name2', '2010-01-01'
union all
select 2, 'name3', '2010-01-02'
union all
select 2, 'name5', '2010-01-01'
union all
select 3, 'name1', '2010-01-01'
union all
select 3, 'name2', '2010-01-01'
union all
select 3, 'name3', '2010-01-02'
union all
select 4, 'name1', '2010-01-02'
union all
select 4, 'name2', '2010-01-01'
union all
select 4, 'name3', '2010-01-02'
union all
select 4, 'name4', '2010-01-01'
--Result should be
Select1, 2
union all
SelectNULL, 2
union all
Select1, 3
union all
Select1, 4
drop table ##Temp1
drop table ##Temp2
go
Dear friends,
I have a requirement n project to compare data of a table for reporting purpose. The above sample code will give you an idea about what i need. My table contains an identity column and a foreign key. I will get a foreign key as a source id and this id may have multiple records in the table. I take this source records (in above case records in ##Temp1) in a temporary table and i have to compare this source records with other records in original table (in above case records in ##Temp2). In the sample I gave, mentioned the result I am expecting.
The first record in result is generated because source table have 'name4' with date and which is not available in target table for id 2.
The second record in result is generated because target table have 'name5' and which is not available in source table.
The third record in result is generated because source table have 'name4' and which is not available in target table for id 3.
The fourth record in result is generated because target table have 'name1' with date '2010-01-02' and which is mismatched with record in source table.
Could you help me to create a script for generating this output? Please give me an idea. Right now I have is a very complex script contains left join which will not give the exact result.

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply

Igor Micev SSC-Dedicated Points: 33109 More actions · Answer 1

Hi

Simply use EXCEPT.

The following

select col1,col2,...,colN from table1

EXCEPT

select col1,col2,...,colN from table2

will give you the different rows.

Another option is to use some tool for data/base comparison

Regards

IgorMi

Igor Micev,My blog: www.igormicev.com

iam777here4u Ten Centuries Points: 1057 More actions · Answer 2

Since I have different ID's and I need them to be displayed in result and I don't want to compare the ID's but all other columns, EXCEPT will not work.

Igor Micev SSC-Dedicated Points: 33109 More actions · Answer 3

Maybe you should do a little work around.

What about this?

select t1.mytab1col1,t2.mytab1col1

from ##Temp1 t1 left join ##Temp2 t2 on t1.mytab1col2=t2.mytab1col2

and t1.mytab1col3=t2.mytab1col3 and t1.mytab1col1 != t2.mytab1col1

group by t1.mytab1col1,t2.mytab1col1

order by t1.mytab1col1 asc

Igor Micev,My blog: www.igormicev.com

iam777here4u Ten Centuries Points: 1057 More actions · Answer 4

That query is having a problem, lets say the records with id value 4 in column mytab1col1 doesn't have any mismatch. The result still displays id 4, which it should not because the requirement is to display only mismatched records.

Igor Micev SSC-Dedicated Points: 33109 More actions · Answer 5

iam777here4u (4/25/2013)
That query is having a problem, lets say the records with id value 4 in column mytab1col1 doesn't have any mismatch. The result still displays id 4, which it should not because the requirement is to display only mismatched records.

You've written

--Result should be

Select1, 2

union all

SelectNULL, 2

union all

Select1, 3

union all

Select1, 4

Igor Micev,My blog: www.igormicev.com

iam777here4u Ten Centuries Points: 1057 More actions · Answer 6

That's true and it is because there was below mismatch with respect to Id 4.

##Temp1 contains

select 1, 'name1', '2010-01-01'

##Temp2 contains

select 4, 'name1', '2010-01-02' --> The eighth record in original script: date mismatch

Lets say both have date '2010-01-01' which means no mismatch with ##Temp1 and ##Temp2 records with respect to ID 4 (only ID differ which we are not comparing). This time 4 should not be displayed

I was working on this morning onwards and what I did is below:

select *

from ##Temp2 t2

left join ##Temp1 t1

on t1.mytab1col2=t2.mytab1col2

and t1.mytab1col3=t2.mytab1col3

where t1.mytab1col1 is null

order by t1.mytab1col1

Other than this script, I used another script which will compare the count(*) group by mytab1col1 to get the mismatch in counts. But this seems complex to me.

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 7

iam777here4u (4/25/2013)
That query is having a problem, lets say the records with id value 4 in column mytab1col1 doesn't have any mismatch. The result still displays id 4, which it should not because the requirement is to display only mismatched records.

Please review your expected output table carefully. There are no column names, which would be useful. In the explanation for the four rows shown you refer to column ID, but elsewhere in your post, the column ID is irrelevant to the exercise. A precise definition will reward you with a precise answer.

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

iam777here4u Ten Centuries Points: 1057 More actions · Answer 8

Sorry for the trouble. I hope below code will give better idea.

create table ##Temp1 (

mytab1col1 int,

mytab1col2 varchar(10),

mytab1col3 smalldatetime)

insert into ##Temp1 (mytab1col1, mytab1col2, mytab1col3)

select 1, 'name1', '2010-01-01'

union all

select 1, 'name2', '2010-01-01'

union all

select 1, 'name3', '2010-01-02'

union all

select 1, 'name4', '2010-01-01'

create table ##Temp2 (

mytab2col1 int,

mytab2col2 varchar(10),

mytab2col3 smalldatetime)

insert into ##Temp2 (mytab2col1, mytab2col2, mytab2col3)

select 2, 'name1', '2010-01-01'

union all

select 2, 'name2', '2010-01-01'

union all

select 2, 'name3', '2010-01-02'

union all

select 2, 'name5', '2010-01-01'

union all

select 3, 'name1', '2010-01-01'

union all

select 3, 'name2', '2010-01-01'

union all

select 3, 'name3', '2010-01-02'

union all

select 4, 'name1', '2010-01-02'

union all

select 4, 'name2', '2010-01-01'

union all

select 4, 'name3', '2010-01-02'

union all

select 4, 'name4', '2010-01-01'

union all

select 5, 'name1', '2010-01-01'

union all

select 5, 'name2', '2010-01-01'

union all

select 5, 'name3', '2010-01-02'

union all

select 5, 'name4', '2010-01-01'

union all

select 5, 'name5', '2010-01-01'

union all

select 6, 'name1', '2010-01-01'

union all

select 6, 'name2', '2010-01-01'

union all

select 6, 'name3', '2010-01-02'

union all

select 6, 'name4', '2010-01-01'

-- Result should be

mytab2col1 mytab1col1

----------- ----------

2 1 --> Id 2 displayed because it is missing name4

2 1 --> Id 2 displayed again because it is having extra record name5

3 1 --> Id 3 displayed because it is missing name4

4 1 --> Id 4 displayed because it is having mismatch in date (please note the first record of ID 4)

5 1 --> Id 5 displayed because it is having extra record name5

--> Id 6 not displayed as no mismatch is occured

--Please note that i want to compare records with all other columns except ID. The ID column should be displayed as result

drop table ##Temp1

drop table ##Temp2

go

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 9

iam777here4u (4/25/2013)
Sorry for the trouble. I hope below code will give better idea....

No worries. I like this better, can we use it? I think folks will find it easier to relate to and to discuss, and it removes those global temp tables too, which are rarely a good idea:

create table #Source (

col1 int,

col2 varchar(10),

col3 smalldatetime)

insert into #Source (col1, col2, col3)

select 1, 'name1', '2010-01-01' union all

select 1, 'name2', '2010-01-01' union all

select 1, 'name3', '2010-01-02' union all

select 1, 'name4', '2010-01-01'

create table #Target (

col1 int,

col2 varchar(10),

col3 smalldatetime)

insert into #Target (col1, col2, col3)

select 2, 'name1', '2010-01-01' union all

select 2, 'name2', '2010-01-01' union all

select 2, 'name3', '2010-01-02' union all

select 2, 'name5', '2010-01-01' union all

select 3, 'name1', '2010-01-01' union all

select 3, 'name2', '2010-01-01' union all

select 3, 'name3', '2010-01-02' union all

select 4, 'name1', '2010-01-02' union all

select 4, 'name2', '2010-01-01' union all

select 4, 'name3', '2010-01-02' union all

select 4, 'name4', '2010-01-01' union all

select 5, 'name1', '2010-01-01' union all

select 5, 'name2', '2010-01-01' union all

select 5, 'name3', '2010-01-02' union all

select 5, 'name4', '2010-01-01' union all

select 5, 'name5', '2010-01-01' union all

select 6, 'name1', '2010-01-01' union all

select 6, 'name2', '2010-01-01' union all

select 6, 'name3', '2010-01-02' union all

select 6, 'name4', '2010-01-01'

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 10

-- Since there are no matches on col1 between the two tables,

-- surely the starting point of the results table should reflect this?

t.col1 s.col1

2,NULL

3,NULL

4,NULL

5,NULL

6,NULL

NULL,1

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Igor Micev SSC-Dedicated Points: 33109 More actions · Answer 11

I came up to the following, based on your logic

select t2.mytab2col1,t1.mytab1col1

from ##Temp2 t2

join ##Temp1 t1 on

((t2.mytab2col2 != t1.mytab1col2) and (t2.mytab2col3 = t1.mytab1col3))

or ((t2.mytab2col2 = t1.mytab1col2) and (t2.mytab2col3 != t1.mytab1col3))

group by t2.mytab2col1,t1.mytab1col1

union all

select ##Temp2.mytab2col1,##Temp1.mytab1col1

from ##Temp2,##Temp1

where not exists (select 1 from ##Temp1 where ##Temp1.mytab1col2 = ##Temp2.mytab2col2)

group by ##Temp2.mytab2col1,##Temp1.mytab1col1

Regards

IgorMi

Igor Micev,My blog: www.igormicev.com

Igor Micev SSC-Dedicated Points: 33109 More actions · Answer 12

I came up to the following, based on your logic

select t2.mytab2col1,t1.mytab1col1

from ##Temp2 t2

join ##Temp1 t1 on

((t2.mytab2col2 != t1.mytab1col2) and (t2.mytab2col3 = t1.mytab1col3))

or ((t2.mytab2col2 = t1.mytab1col2) and (t2.mytab2col3 != t1.mytab1col3))

group by t2.mytab2col1,t1.mytab1col1

union all

select ##Temp2.mytab2col1,##Temp1.mytab1col1

from ##Temp2,##Temp1

where not exists (select 1 from ##Temp1 where ##Temp1.mytab1col2 = ##Temp2.mytab2col2)

group by ##Temp2.mytab2col1,##Temp1.mytab1col1

Regards

IgorMi

Igor Micev,My blog: www.igormicev.com

Kingston Dhasian SSCoach Points: 19799 More actions · Answer 13

One more way to do this..

; WITH cte_Temp1 AS

(

SELECTT1.*, T2.mytab1col1 AS mytab1col1_T2

FROM##Temp1 AS T1

CROSS JOIN(

SELECTDISTINCT T2.mytab1col1

FROM##Temp2 AS T2

) AS T2

)

SELECTT1.mytab1col1, T1.mytab1col1_T2

FROMcte_Temp1 AS T1

LEFT OUTER JOIN##Temp2 AS T2 ON T1.mytab1col1_T2 = T2.mytab1col1 AND T1.mytab1col2 = T2.mytab1col2 AND T1.mytab1col3 = T2.mytab1col3

WHERET2.myid IS NULL

UNION ALL

SELECTT1.mytab1col1, COALESCE(T1.mytab1col1_T2,T2.mytab1col1)

FROMcte_Temp1 AS T1

RIGHT OUTER JOIN ##Temp2 AS T2 ON T1.mytab1col1_T2 = T2.mytab1col1 AND T1.mytab1col2 = T2.mytab1col2

WHERET1.myid IS NULL

Kingston Dhasian

How to post data/code on a forum to get the best help - Jeff Moden
http://www.sqlservercentral.com/articles/Best+Practices/61537/

iam777here4u Ten Centuries Points: 1057 More actions · Answer 14

Thank you so much Kingston... that works. I will be implementing your logic in my original script. I hope using CTE for millions of data is okay. Let you know the result soon.