hans.reidt (1/28/2015)


    I do this on tables with more than 1,000,000 records.

    Something to keep in mind, of course, is the number and complexity of the indexes.

    A possibility is to disable all indexes, remove the duplicates, and rebuild the indexes (a rough sketch of this follows after the quote).

    Of course, when you have duplicates in your data, the problem is usually the import.

    I try to solve that with a LEFT OUTER JOIN on the unique columns, as sketched below.

    This way I get idempotence, which is nice.
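
A minimal sketch of that idempotent-import pattern, assuming a hypothetical staging table dbo.StagingImport feeding dbo.Customer, where CustomerCode is the unique column (all of these names are placeholders, not anything from the original post):

    -- Insert only staging rows whose unique key is not already present,
    -- so re-running the same import inserts nothing the second time.
    INSERT INTO dbo.Customer (CustomerCode, CustomerName)
    SELECT s.CustomerCode, s.CustomerName
    FROM dbo.StagingImport AS s
    LEFT OUTER JOIN dbo.Customer AS c
        ON c.CustomerCode = s.CustomerCode   -- join on the unique column(s)
    WHERE c.CustomerCode IS NULL;            -- keep unmatched rows only

Run once, it loads the new rows; run again, the join matches every row and nothing is inserted, which is the idempotence being described.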
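And the disable / dedupe / rebuild sequence could look roughly like the following, again with placeholder names (IX_Customer_Email, CustomerId). The dedupe step here uses the common ROW_NUMBER pattern, which the quoted post doesn't specify; and only nonclustered indexes should be disabled, since disabling the clustered index makes the table unreadable:

    -- 1. Disable a nonclustered index so the mass delete doesn't maintain it.
    ALTER INDEX IX_Customer_Email ON dbo.Customer DISABLE;

    -- 2. Remove duplicates: number the rows within each unique key
    --    and delete everything past the first.
    WITH numbered AS (
        SELECT ROW_NUMBER() OVER (
                   PARTITION BY CustomerCode   -- the unique column(s)
                   ORDER BY CustomerId         -- decides which copy survives
               ) AS rn
        FROM dbo.Customer
    )
    DELETE FROM numbered
    WHERE rn > 1;

    -- 3. Rebuilding the index also re-enables it.
    ALTER INDEX IX_Customer_Email ON dbo.Customer REBUILD;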

    Sure, I'm not saying it can't work. I'm saying that, depending on your server's power and resources, it might not be a great option. Especially when you get to really big tables: 10 million, 100 million rows, or more. Or if you're doing this during a data import. And I don't think I'd want that code in any automated deduping procedures.

    --------------------------------------
    When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
    --------------------------------------
    It’s unpleasantly like being drunk.
    What’s so unpleasant about being drunk?
    You ask a glass of water. -- Douglas Adams