need technique to find duplicates which are causing PK constraint failure during merge

Question

need technique to find duplicates which are causing PK constraint failure during merge

KoldCoffee

SSCoach

Points: 18971
More actions
July 22, 2011 at 11:32 pm

#254840

I am merging two tables:
TBLANIMAL (as T)
animal varchar(40)
haveit bit
name varchar(30)
ANIMAL (as A)
animal varchar(40)
haveit bit
name varchar(30) Primary Key
I am merging on T.animal = A.animal. The source table (T) has many duplicates in the name column, that can't be brought over to the target table (A) because it's name column is protected with a PK.
I need to get rid of all potential duplicates so the merge can go through. What technique can I use to find all the rows, which satisfy the T.animal = A.animal condition and have duplicate name values in the source table . . . and get rid of them?

Viewing 7 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply

crazy4sql SSCoach Points: 19590 More actions · Answer 1

I need to get rid of all potential duplicates so the merge can go through. What technique can I use to find all the rows, which satisfy the T.animal = A.animal condition and have duplicate name values in the source table . . . and get rid of them?

select name,COUNT(name) from sourcetable group by name having COUNT(*)>1

This will give you the number of count with the name of those record which have more than one entry.

then you can delete the duplicate record by leaving just one record of those duplicates.

hope this help.

----------
Ashish

KoldCoffee SSCoach Points: 18971 More actions · Answer 2

thank you.

Your solution is similar to one I tried to follow here: I have used this and found duplicates. however, there are so many, I don't know how to get rid of all of them. eg. some cases 126 rows with duplicates (and I am sure I want to delete, at this point).

Do you have a technique for deleting duplicates?

crazy4sql SSCoach Points: 19590 More actions · Answer 3

When its coding, i dont hit my head in it. I simply use google for it and then customise the solution as per requirement.

this article will give you clue -- http://support.microsoft.com/kb/139444

or this approach is also good to try as written by Pinal

http://blog.sqlauthority.com/2007/03/01/sql-server-delete-duplicate-records-rows/

----------
Ashish

KoldCoffee SSCoach Points: 18971 More actions · Answer 4

thanks. yes, the first link to msft is the one I was referring to. Thanks for Pinal's, I'll take a look.

Ron McCullough SSC Guru Points: 63877 More actions · Answer 5

You may want to use the MERGE statement, for example:

CREATE TABLE TBLANIMAL(animal vaINSERT INTO TBLANIMAL

SELECT 'Horse',0,'Giddyup' UNION ALL

SELECT 'Elephant',1,'Bigone' UNION ALL

SELECT 'Horse',0,'Giddyup' UNION ALL

SELECT 'Elephant',1,'Bigone' UNION ALL

SELECT 'Goat',0,'no name'

CREATE TABLE ANIMAL(animal varchar(40),haveit bit,name varchar(30))

Then use the MERGE statement:

SET NOCOUNT ON;

MERGE Animal AS target

USING (SELECT DISTINCT(Name),Animal,haveit FROM TBLAnimal) AS source (Name,Animal,haveit)

ON (target.Animal = Source.Animal AND target.haveit = Source.haveit AND Target.name = source.name)

WHEN NOT MATCHED THEN

INSERT (Animal,haveit,name)

VALUES (Source.Animal,Source.haveit,Source.name);

Result:

SELECT * FROM Animal

animal haveit name

Elephant1 Bigone

Goat 0 no name

Horse 0 Giddyup

Edited to add:

You might want to take the time to watch this short video on the MERGE statement:

http://www.sqlshare.com/player.aspx?vid=513

If everything seems to be going well, you have obviously overlooked something.

Ron

Please help us, help you -before posting a question please read[/url]
Before posting a performance problem please read[/url]

Jeff Moden SSC Guru Points: 1004654 More actions · Answer 6

hxkresl (7/22/2011)
I am merging two tables:
TBLANIMAL (as T)
animal varchar(40)
haveit bit
name varchar(30)
ANIMAL (as A)
animal varchar(40)
haveit bit
name varchar(30) Primary Key
I am merging on T.animal = A.animal. The source table (T) has many duplicates in the name column, that can't be brought over to the target table (A) because it's name column is protected with a PK.
I need to get rid of all potential duplicates so the merge can go through. What technique can I use to find all the rows, which satisfy the T.animal = A.animal condition and have duplicate name values in the source table . . . and get rid of them?

IIRC, doesn't 2008 have a MERGE command that would easily take care of this for you?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)