eliminating duplicates

  • delete from afrom(select Emp_Name, Company, Join_Date, Resigned_Date

    ,ROW_NUMBER() over (partition by Emp_Name, Company, Join_Date

    ,Resigned_Date

    order by Emp_Name, Company, Join_Date

    ,Resigned_Date) RowNumber

    from Emp_Details) awhere a.RowNumber = 1

  • edaboin (9/22/2010)


    delete from afrom(select Emp_Name, Company, Join_Date, Resigned_Date

    ,ROW_NUMBER() over (partition by Emp_Name, Company, Join_Date

    ,Resigned_Date

    order by Emp_Name, Company, Join_Date

    ,Resigned_Date) RowNumber

    from Emp_Details) awhere a.RowNumber = 1

    This is a 2 year old thread. The solution you provided will not work here as this is a SQL Server 2000 forum and ROW_NUMBER is not supported in SS2000.

    John Rowan

    ======================================================
    ======================================================
    Forum Etiquette: How to post data/code on a forum to get the best help[/url] - by Jeff Moden

  • I've had a similar situation in the past and resolved by using a CTE and row ranking. I've created a test table similar to the one provided that has the following data:

    ABC24M

    ABC24M

    ABC24M

    LMN27M

    LMN27M

    LMN27M

    LMN27M

    PQRS25F

    XYZ24M

    XYZ25M

    Use a CTE and row ranking, you can determine your duplicate sets:

    WITH a AS (

    SELECT NAME,age,Sex,ROW_NUMBER() OVER (PARTITION BY NAME ORDER BY age desc) rnk

    FROM dbo.TAB)

    SELECT * FROM a;

    Which produces the following:

    ABC24M1

    ABC24M2

    ABC24M3

    LMN27M1

    LMN27M2

    LMN27M3

    LMN27M4

    PQRS25F1

    XYZ24M1

    XYZ25M2

    I'm not sure how you determine which row you would like to delete, but in my case I was able to delete any row that was greater than 1:

    WITH a AS (

    SELECT NAME,age,Sex,ROW_NUMBER() OVER (PARTITION BY NAME ORDER BY age asc) rnk

    FROM dbo.TAB)

    delete FROM a WHERE a.rnk > 1;

    which produces

    ABC24M

    LMN27M

    PQRS25F

    XYZ24M

    If you just want to delete the first row in each set that contains more than one entry, you would have to do something like the following:

    WITH a AS (

    SELECT NAME,age,Sex,ROW_NUMBER() OVER (PARTITION BY NAME ORDER BY age asc) rnk

    FROM dbo.TAB)

    delete FROM a WHERE a.rnk = 1 AND a.NAME IN

    (SELECT NAME FROM a WHERE a.rnk > 1);

    which produces:

    ABC24M

    ABC24M

    LMN27M

    LMN27M

    LMN27M

    PQRS25F

    XYZ25M

    Hope this helps!

    Note: I did not see this was sql 2000 or the thread was three years old!

Viewing 3 posts - 136 through 137 (of 137 total)

You must be logged in to reply to this topic. Login to reply