How to skip duplicate record and keep on running until the end?

Question

How to skip duplicate record and keep on running until the end?

adonetok

SSCrazy Eights

Points: 8395
More actions
April 27, 2020 at 9:01 pm

#3746811

I used code below to insert data into a table from another table.
INSERT INTO tUser (Name,Email)
SELECT Name,Email
FROM tUSER_NEW T
WHERE T.EMAIL NOT IN (SELECT EMAIL FROM tUser)
But, this code will stop running once there is a duplicate record.
How to skip duplicate record and keep on running until the end?

Viewing 9 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply

Phil Parkin SSC Guru Points: 247201 More actions · Answer 1

Your code does not 'skip' anything, nor is there any concept of anything 'running until the end'. Can you state the problem more clearly? Feel free to add sample data if it helps make things more lucid.

Jeffrey Williams SSC Guru Points: 90351 More actions · Answer 2

Can you define what a 'duplicate' actually means? Is that the same Name,Email combination - or just the same name - or just the same email?

If the same combination identifies a 'duplicate' - then use DISTINCT on the query. If a duplicate is defined by either the same 'name' or the same 'email' only then use row_number() in a common-table expression and exclude from inserting any rows with a row number > 1.

Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

― Charles R. Swindoll

How to post questions to get better answers faster
Managing Transaction Logs

adonetok SSCrazy Eights Points: 8395 More actions · Answer 3

I set up primary key on email column.

I want to void inserting any duplicate record in email column.

Phil Parkin SSC Guru Points: 247201 More actions · Answer 4

This might work.

INSERT tUser (Name,Email)
SELECT DISTINCT Name,Email
FROM tUSER_NEW tnew
WHERE NOT EXISTS (SELECT * FROM tUser t where t.Email = tnew.Email)

But if the same e-mail exists in tUSER_New for multiple Name values, it will still error. If this is the case, you would need to refine the source query still further to eliminate all duplicate e-mails – this would require you to decide on which version of NAME is retained.

Atul DBA Hall of Fame Points: 3610 More actions · Answer 5

Put the duplicate email from Source table into a temporary table:

SELECT DISTINCT COUNT(0), email INTO #email FROM tUSER_NEW GROUP BY email HAVING COUNT(0) > 1

Keep Duplicate Records in another temporary table for your reference and analysis

SELECT name, email INTO #duplicate FROM tUSER_NEW WHERE email IN (SELECT email FROM #email)

Now Insert the data where Emails are not matching:

INSERT INTO tUser (Name,Email)

SELECT Name, Email

FROM tUSER_NEW t

WHERE t.email NOT IN (SELECT email from #email) AND t.email NOT IN (SELECT EMAIL FROM tUser)

Hope this will work.

Jeff Moden SSC Guru Points: 1004683 More actions · Answer 6

adonetok wrote:

I set up primary key on email column.
I want to void inserting any duplicate record in email column.

That creates an index. If you modify that index to include the IGNORE_DUP_KEY option, that should solve your problem. Warning: I've not tested it for performance but I can't seen that it should be.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

adonetok SSCrazy Eights Points: 8395 More actions · Answer 7

I tried Phil Parkin's way, it is working now.

Thank all of you.

Jeff Moden SSC Guru Points: 1004683 More actions · Answer 8

adonetok wrote:

I tried Phil Parkin's way, it is working now.
Thank all of you.

Cool. Thanks for the feedback. To be sure, though, do you understand how and why Phil's good code works so you can do it the next time something like this comes up? And, to be equally sure, not being snarky here. We aim to teach when we answer a post to help folks learn things.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)