Yet Another Deleting Duplicate Records (in-place)

, 2009-01-15 (first published: )

Here is another "deleting duplicate records" script, this time in-place. It exploits the ability to relatively easily add and drop a uniqueidentifier column to a table. The term "must-be-unique columns" refers to the set of columns that have to have a unique combination of values (natural primary key candidate for the table). The DELETE statement is quite generic and the comments suggest the changes needed for any other table. In addition, you obviously have to replace "MyTab" with the name of your table.

Note: The "DELETE" statement is designed and formatted in order to extend its usage - replacing "DELETE t" with "SELECT t.*" will return the records that will be deleted.

USE tempdb;
GO

CREATE TABLE MyTab (Id1 int, Id2 varchar(1));
GO

INSERT MyTab
SELECT 1, 'A' UNION ALL
SELECT 1, 'B' UNION ALL
SELECT 1, 'B' UNION ALL
SELECT 1, 'C' UNION ALL
SELECT 1, 'D' UNION ALL
SELECT 2, 'A' UNION ALL
SELECT 2, 'A' UNION ALL
SELECT 2, 'A' UNION ALL
SELECT 2, 'B' UNION ALL
SELECT 2, 'C'
GO

SELECT * FROM MyTab;
GO

ALTER TABLE MyTab
ADD rowguid uniqueidentifier NOT NULL
CONSTRAINT DF__MyTab__rowguid DEFAULT (NEWID());
GO

SELECT * FROM MyTab;
GO

DELETE t -- SELECT t.* 
FROM MyTab t -- your table name
JOIN (
 SELECT Id1, Id2 -- list of must-be-unique columns
 , MAX(CAST(rowguid AS varchar(100))) AS max_rowguid
 FROM MyTab -- your table name
 GROUP BY Id1, Id2 -- list of must-be-unique columns
 HAVING COUNT(*) > 1
) t1
 ON t.Id1=t1.Id1 -- add all must-be-unique columns 
 AND t.Id2=t1.Id2
WHERE t.rowguid <> t1.max_rowguid;
GO

ALTER TABLE MyTab DROP CONSTRAINT DF__MyTab__rowguid;
GO
ALTER TABLE MyTab DROP COLUMN rowguid;
GO

SELECT * FROM MyTab;
GO

DROP TABLE MyTab;
GO

Rate

4.67 (3)

Share

Share

Rate

4.67 (3)

Related content