hans.reidt (1/28/2015)
I do this on tables with > 1,000,000 records. Something to keep in mind, of course, is the number and complexity of indexes.
A possibility is to disable all indexes, remove the duplicates and rebuild the indexes.
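The drop / dedupe / rebuild idea can be sketched like this. This is a hypothetical illustration using SQLite from Python so it stays runnable; the table and column names (`staging`, `email`, `name`) are invented, and on SQL Server you would use `ALTER INDEX ... DISABLE` and `ALTER INDEX ... REBUILD` instead of dropping and recreating:

```python
import sqlite3

# Hypothetical staging table with duplicate rows and one index.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staging (email TEXT, name TEXT);
    CREATE INDEX ix_staging_email ON staging (email);
    INSERT INTO staging VALUES
        ('a@x.com', 'Ann'), ('a@x.com', 'Ann'), ('b@x.com', 'Bob');
""")

# 1) Drop the index so the delete pays no index-maintenance cost.
conn.execute("DROP INDEX ix_staging_email")

# 2) Remove duplicates, keeping one row (lowest rowid) per unique key.
conn.execute("""
    DELETE FROM staging
    WHERE rowid NOT IN (
        SELECT MIN(rowid) FROM staging GROUP BY email, name
    )
""")

# 3) Recreate the index once, over the deduplicated data.
conn.execute("CREATE INDEX ix_staging_email ON staging (email)")

print(conn.execute("SELECT COUNT(*) FROM staging").fetchone()[0])
```

Rebuilding once at the end is the point: each deleted row no longer triggers a per-row index update.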
Of course, when you have duplicates in your data, the problem is usually the import.
I try to solve that with a LEFT OUTER JOIN on the unique columns.
This way I get idempotence, which is nice.
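The idempotent import described above can be sketched as an anti-join insert: select from staging, LEFT OUTER JOIN to the target on the unique columns, and insert only where the join found no match. Again a hypothetical sketch in SQLite via Python (table and column names are invented, not from the post):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE target  (email TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE staging (email TEXT, name TEXT);
    INSERT INTO target  VALUES ('a@x.com', 'Ann');
    INSERT INTO staging VALUES ('a@x.com', 'Ann'), ('b@x.com', 'Bob');
""")

import_sql = """
    INSERT INTO target (email, name)
    SELECT s.email, s.name
    FROM staging AS s
    LEFT OUTER JOIN target AS t ON t.email = s.email
    WHERE t.email IS NULL          -- keep only rows not already in target
"""
conn.execute(import_sql)
conn.execute(import_sql)  # running the import again inserts nothing

print(conn.execute("SELECT COUNT(*) FROM target").fetchone()[0])
```

Because the WHERE clause filters out every staging row that already exists in the target, re-running the import is a no-op, which is the idempotence being described. Note this assumes the staging data itself is already unique on the join columns.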
Sure, I'm not saying it can't work. I'm saying that depending on your server's power and resources, it might not be a great option, especially when you get to really big tables: 10 million, 100 million rows or more. Or if you're doing this during a data import. And I don't think I'd want that code in any automated deduping procedures.
--------------------------------------
When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
--------------------------------------
It’s unpleasantly like being drunk.
What’s so unpleasant about being drunk?
You ask a glass of water. -- Douglas Adams