RE: Remove Duplicates from a Table using specific columns

SSC Eights!

Points: 938

March 24, 2013 at 9:14 am

if you're working on a huge amount of data I think the most optimized way would be to use the row)number function. This allows you to partition the values you need to determine to be unique.

In the query I have written below you can see that by partitioning you are assigning unique values to those columns in the partition. If you come across two values that are the same the partition will assign a new number for that value. The outer query where clause is important in this case because by choosing one you are choosing to only select those that are non duplicated.

njoy!

CREATE TABLE #temp

(

Col1 varchar(100),

Col2 varchar(100) ,

Col3 varchar(100) ,

Col4 varchar(100)

)

INSERT INTO #temp

VALUES ('A', 'Y', '1', 'Z'),

('B', 'Y', '2', 'Q'),

('C', 'Y', '3', 'R'),

('A', 'Y', '4', 'Z'),

('A', 'Y', '5', 'Z')

SELECT Col1,

Col2,

Col3,

Col4 FROM (

select ROW_NUMBER()OVER (PARTITION BY Col1 order by COL1) ROWID,

Col1,

Col2,

Col3,

Col4

FROM #temp)LL

where ROWID = 1