• Lowell - Wednesday, June 28, 2017 1:35 PM

    wouldn't it be better to calculate a CHECKSUM or BINARY_CHECKSUM over the list of columns to test instead?
    the space used would be substantially smaller, right? Then you could use a temp table with those values, and a CTE with a ROW_NUMBER to delete the dupes.

    SELECT CHECKSUM(col1,col2,col46)
    FROM MyTable
    GROUP BY CHECKSUM(col1,col2,col46)
    HAVING COUNT(*) > 1
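
    The ROW_NUMBER delete mentioned above might look something like this sketch. The table name MyTable and the columns col1, col2, col46 are carried over from the example; the choice of which row to keep within each group is arbitrary here. Also note that CHECKSUM can collide (different values hashing to the same checksum), so partitioning on the actual columns is the safer way to do the final delete:

    ;WITH Dupes AS
    (
        SELECT ROW_NUMBER() OVER
               (PARTITION BY col1, col2, col46
                ORDER BY (SELECT NULL)) AS rn   -- arbitrary survivor per group
        FROM MyTable
    )
    DELETE FROM Dupes   -- deleting through the CTE deletes from MyTable
    WHERE rn > 1;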

    Lowell makes a good point.  I like the ROW_NUMBER approach myself.

    Another thing to keep in mind is the size of the log file.  The INSERTs in Mike's approach and the DELETEs in Lowell's approach are all fully logged operations.  If it were me, I'd figure out how many rows I was going to keep versus how many I was going to delete, and let that influence the decision.
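
    One way to size that up before committing to either approach (a sketch, reusing the assumed table and columns from the example above):

    SELECT SUM(cnt)     AS total_rows,
           SUM(cnt - 1) AS rows_to_delete   -- extras beyond one per group
    FROM (
        SELECT COUNT(*) AS cnt
        FROM MyTable
        GROUP BY col1, col2, col46
    ) AS d;

    If rows_to_delete is most of the table, inserting the keeper rows into a new table and swapping it in will usually generate less log than a huge DELETE.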