Home Forums SQL Server 2008 T-SQL (SS2K8) Remove Duplicates from a Table using specific columns RE: Remove Duplicates from a Table using specific columns

  • if you're working on a huge amount of data I think the most optimized way would be to use the row)number function. This allows you to partition the values you need to determine to be unique.

    In the query I have written below you can see that by partitioning you are assigning unique values to those columns in the partition. If you come across two values that are the same the partition will assign a new number for that value. The outer query where clause is important in this case because by choosing one you are choosing to only select those that are non duplicated.

    njoy!

    CREATE TABLE #temp

    (

    Col1 varchar(100),

    Col2 varchar(100) ,

    Col3 varchar(100) ,

    Col4 varchar(100)

    )

    INSERT INTO #temp

    VALUES ('A', 'Y', '1', 'Z'),

    ('B', 'Y', '2', 'Q'),

    ('C', 'Y', '3', 'R'),

    ('A', 'Y', '4', 'Z'),

    ('A', 'Y', '5', 'Z')

    SELECT Col1,

    Col2,

    Col3,

    Col4 FROM (

    select ROW_NUMBER()OVER (PARTITION BY Col1 order by COL1) ROWID,

    Col1,

    Col2,

    Col3,

    Col4

    FROM #temp)LL

    where ROWID = 1