Query Help - For Duplicates

Question

Query Help - For Duplicates

inayatkhan

SSC Eights!

Points: 844
More actions
April 27, 2017 at 7:33 am

#321092

Hi
I need to find duplicate AID will be unique base on Name so I need to find duplicates on KeyValue column Group by AID.
so it means in this dataset AID 1 is duplicate for AID 4 as both matching so my expected result is to get AID 1 and 4.
create table #temp(AID int, Name varchar(500), KeyValue varchar(500))
insert #temp values (1, 'LN', 'CP_VA_LA') insert #temp values (1, 'LT', 'C1') insert #temp values (2, 'LN', 'CP_SA_LA') insert #temp values (2, 'LT', 'C1') insert #temp values (3, 'LN', 'CP_SA_LA') insert #temp values (3, 'LT', 'C2') insert #temp values (4, 'LN', 'CP_VA_LA') insert #temp values (4, 'LT', 'C1') insert #temp values (5, 'LN', 'CP_VA_DA') insert #temp values (5, 'LT', 'C4')
Thanks

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply

Alan Burstein SSC Guru Points: 61136 More actions · Answer 1

This can be improved but here's one way:
WITH piv AS ( SELECT AID, name1 = MAX(CASE [name] WHEN 'LN' THEN [name] END), name2 = MAX(CASE [name] WHEN 'LT' THEN [name] END), KeyValue1 = MAX(CASE [name] WHEN 'LT' THEN KeyValue END), KeyValue2 = MAX(CASE [name] WHEN 'LN' THEN KeyValue END) FROM #temp GROUP BY AID ), findDupes AS ( SELECT piv.name1 ,piv.name2 , piv.KeyValue1 ,piv.KeyValue2 FROM piv GROUP BY piv.name1, piv.name2, piv.KeyValue1, piv.KeyValue2 HAVING COUNT(*) > 1 ) SELECT piv.AID FROM findDupes d JOIN piv ON d.name1 = piv.name1 AND d.name2 = piv.name2 AND d.KeyValue1 = piv.KeyValue1 AND d.KeyValue2 = piv.KeyValue2;

"I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

-- Itzik Ben-Gan 2001

inayatkhan SSC Eights! Points: 844 More actions · Answer 2

Issue with this approach is I have big data set and some time I have two type of Name like (LN, LT) and some time 3 (LN, LT, LC) and some time 1 type of Name.only (LN) so in these cases its not work for all scenarios.

Thanks

Alan Burstein SSC Guru Points: 61136 More actions · Answer 3

inayatkhan - Thursday, April 27, 2017 8:07 AM
Issue with this approach is I have big data set and some time I have two type of Name like (LN, LT) and some time 3 (LN, LT, LC) and some time 1 type of Name.only (LN) so in these cases its not work for all scenarios.
Thanks

Ok, how about this

IF OBJECT_ID('tempdb..#temp') IS NOT NULL DROP TABLE #temp; GO create table #temp(AID int, Name varchar(500), KeyValue varchar(500)); GO

INSERT #temp values (1, 'LN', 'CP_VA_LA') ,(1, 'LT', 'C1') ,(2, 'LN', 'CP_SA_LA') ,(2, 'LT', 'C1') ,(3, 'LN', 'CP_SA_LA') ,(3, 'LT', 'C2') ,(4, 'LN', 'CP_VA_LA') ,(4, 'LT', 'C1') ,(5, 'LN', 'CP_VA_DA') ,(5, 'LT', 'C4') ,(5, 'LC', 'XXXA') ,(6, 'LN', 'CP_VA_DA') ,(6, 'LT', 'C4') ,(11, 'LN', 'CP_ZA_LA') ,(11, 'LT', 'C1') ,(14, 'LN', 'CP_ZA_LA') ,(14, 'LT', 'C1') ,(14, 'LC', 'XXXA') ,(21, 'LN', 'CP_ZA_LA') ,(21, 'LT', 'C5') ,(21, 'LC', 'XXXA') ,(24, 'LN', 'CP_ZA_LA') ,(24, 'LT', 'C5') ,(24, 'LC', 'XXXA') ; GO

WITH piv AS ( SELECT AID, name1 = MAX(CASE [name] WHEN 'LN' THEN [name] END), name2 = MAX(CASE [name] WHEN 'LT' THEN [name] END), name3 = MAX(CASE [name] WHEN 'LC' THEN [name] END), KeyValue1 = MAX(CASE [name] WHEN 'LT' THEN KeyValue END), KeyValue2 = MAX(CASE [name] WHEN 'LN' THEN KeyValue END), KeyValue3 = MAX(CASE [name] WHEN 'LC' THEN KeyValue END) FROM #temp GROUP BY AID ), findDupes AS ( SELECT piv.name1, piv.name2, piv.name3, piv.KeyValue1, piv.KeyValue2, piv.KeyValue3 FROM piv GROUP BY piv.name1, piv.name2, piv.name3, piv.KeyValue1, piv.KeyValue2, piv.KeyValue3 HAVING COUNT(*) > 1 ) SELECT piv.* FROM findDupes d JOIN piv ON d.name1 = piv.name1 AND ISNULL(d.name2,'') = ISNULL(piv.name2,'') AND ISNULL(d.name3,'') = ISNULL(piv.name3,'') AND d.KeyValue1 = piv.KeyValue1 AND ISNULL(d.KeyValue2,'') = ISNULL(piv.KeyValue2,'') AND ISNULL(d.KeyValue3,'') = ISNULL(piv.KeyValue3,'');

Which returns:

"I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

-- Itzik Ben-Gan 2001

inayatkhan SSC Eights! Points: 844 More actions · Answer 4

inayatkhan

SSC Eights!

Points: 844

April 27, 2017 at 9:38 am

#1940243

thanks!

Alan Burstein SSC Guru Points: 61136 More actions · Answer 5

No problem!

"I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

-- Itzik Ben-Gan 2001