SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Eliminating Duplicate Rows using The PARTITION BY clause


Eliminating Duplicate Rows using The PARTITION BY clause

Author
Message
patrickmcginnis59
patrickmcginnis59
SSChasing Mays
SSChasing Mays (645 reputation)SSChasing Mays (645 reputation)SSChasing Mays (645 reputation)SSChasing Mays (645 reputation)SSChasing Mays (645 reputation)SSChasing Mays (645 reputation)SSChasing Mays (645 reputation)SSChasing Mays (645 reputation)

Group: General Forum Members
Points: 645 Visits: 2333
David Lean (9/21/2010)
Nice article & Nice solution. But I would caution anyone about to blindly using this in production. It is only the 1st part of the process.

Problem 1: need to Fix Declarative Referential Integrity (DRI) for the rows deleted.
Invariably the duplicate rows you are about to remove will be referenced by the foreign keys in another table. When you delete these rows you must keep a copy of the Keys of the rows you deleted & map them to the key of the “duplicate” row you kept. That way you have some lookup table you can use to correct all the foreign key references.


If a foreign key referenced only one of the rows you are deleting, isn't that by definition not a duplicate? So then in that case can't you just delete with the unique key?
jswong05
jswong05
SSC Veteran
SSC Veteran (227 reputation)SSC Veteran (227 reputation)SSC Veteran (227 reputation)SSC Veteran (227 reputation)SSC Veteran (227 reputation)SSC Veteran (227 reputation)SSC Veteran (227 reputation)SSC Veteran (227 reputation)

Group: General Forum Members
Points: 227 Visits: 476
If a foreign key referenced only one of the rows you are deleting, isn't that by definition not a duplicate? So then in that case can't you just delete with the unique key?

He is saying if the table has a primary key and many columns but the data is already screwed up. Now they identify four other columns combination as unique but data shows duplicates by these four. (The four columns should have had a design of unique key to begin with)

Jason
http://dbace.us
:-P
prakashp 84206
prakashp 84206
Grasshopper
Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)

Group: General Forum Members
Points: 21 Visits: 37
Hi,
The output of my query contains duplicate records (i.e. column1 value is similar but other column's value different) as well as single record having column2 value blanks(-). now I want to select only one record out of duplicate records for which column2 values are present (eliminating records for which column2 is blanks(-) from duplicate records) and all other single records having column2 values blanks.

My query's Present Output
column1 column2 column3
abc xyz pqr
abc - klm
ijk - uvw
plo - ujn
yhk ttg wea
yhk - erf

Expected Output
column1 column2 column3
abc xyz pqr
ijk - uvw
plo - ujn
yhk ttg wea

Can someone help me. Thanks in advance.
ChrisM@Work
ChrisM@Work
SSCoach
SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)

Group: General Forum Members
Points: 16085 Visits: 19534
prakashp 84206 (6/1/2012)
Hi,
The output of my query contains duplicate records (i.e. column1 value is similar but other column's value different) as well as single record having column2 value blanks(-). now I want to select only one record out of duplicate records for which column2 values are present (eliminating records for which column2 is blanks(-) from duplicate records) and all other single records having column2 values blanks.

My query's Present Output
column1 column2 column3
abc xyz pqr
abc - klm
ijk - uvw
plo - ujn
yhk ttg wea
yhk - erf

Expected Output
column1 column2 column3
abc xyz pqr
ijk - uvw
plo - ujn
yhk ttg wea

Can someone help me. Thanks in advance.


Welcome aboard.
Read the link in my sig, this will assist you in describing your problem and encouraging others to help. Then start a new thread. It's likely that your problem is sufficiently different to the original on this thread that it's better separated from it.

“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Exploring Recursive CTEs by Example Dwain Camps
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search