The following was sent to me by my friend and colleague Dave Miller:
Wanted to pass along something I hadn't used before and found useful to easily get rid of duplicates in a set of data. The functionality has existed in the SQL language and was supported in SQL Server 2005. This uses Common Table Expressions (CTE) and the ROW_NUMBER() function.
The PARTITION BY portion of the statment specifies when to reset the row number, in my example I had:
The following query will remove the duplicates:
WITH PersonPhones AS (
SELECT ROW_NUMBER() OVER (PARTITION BY PersonId, PhonetypeId, PhoneNumber ORDER BY PersonId, PhonetypeId, PhoneNumber) AS GroupIndex,
FROM Person.Phone pp
DELETE FROM PersonPhones WHERE GroupIndex > 1
The CTE would create a table with the following data, where anything that is a duplicate has a GroupIndexId > 1
This method could be used to get the nth item of a group also.