RE: Rank duplicates, but only rows involved in duplicates

SSC Guru

Points: 100947

October 22, 2014 at 3:49 pm

jeff.born (10/22/2014)
The solution with the min and max would work if there were only duplicates, but Businesses have been known to be duplicated 100s of times due to previous incorrectly coded merging and matching of data.
But I'm still hoping for something that simple!

Sorry, my intent wasn't to limit to just the max, but to have that value available to use in a key range when deleting the dups, to reduce how many of the 22M rows you had to search.

I.e.:

DELETE FROM mt

FROM <main_table> mt

INNER JOIN <dups_table> dt ON

dt.name = mt.name and

dt.phone = mt.phone and ...

WHERE

mt.pk > dt.min_pk AND

mt.pk <= dt.max_pk

SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.