Using OVER() to Remove Duplicate Rows

  • Thanks for the comment pdanes.

    Indeed, as we all know there are several ways to achieve the same thing (as eduarrr and others have pointed out). There will always be views on what is "better". Glad the key point about the OVER clause resonated with you.

    Cheers,

    Mike.

  • Irfan.Baig (11/7/2016)


    Hello Dear Mike,

    I think we can also remove duplicates using simplest, easiest and better way without using temp table/s logic:

    WITH CTE_InsurancePolicy AS (

    SELECT PolicyId, FirstName, Surname, PolicyStartDate, PolicyEndDate

    ,ROW_NUMBER() OVER (PARTITION BY PolicyId, FirstName, Surname, PolicyStartDate, PolicyEndDate ORDER BY PolicyId) AS RowNumber

    FROM dbo.InsurancePolicy

    )

    DELETE CTE_InsurancePolicy

    WHERE RowNumber > 1;

    Thanks

    Irfan Baig.

    That's brilliant. I've never seen that before and I found a use for it immediately. I'd got a CTE to create the row numbers but I didn't know you could DELETE directly from it.


    On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.
    —Charles Babbage, Passages from the Life of a Philosopher

    How to post a question to get the most help http://www.sqlservercentral.com/articles/Best+Practices/61537

  • Yev.d (11/8/2016)


    Irfan.Baig (11/7/2016)


    I think we can also remove duplicates using simplest, easiest and better way without using temp table/s logic:

    I think the simplest in this case would be to truncate the table and re-run the import process.

    😉

    Simplest, yes. However, my pride is WAY to much in the way and I would definitely not go backward by starting over.:hehe:

    My wife could share stories of long(er) car rides (before GPS's) because there was no way I was going to turn back and go the way we just came. There's another way...maybe even a shorter way...somewhere. lol

  • Whoa, I didn't realize either that deleting from the CTE could delete from the original table. And you can insert/update too. Glad I didn't learn that lesson at an inopportune time!

    This method of duplicate deletion sort of imposes a PK where there is none.

    Ken

    BWFC (11/10/2016)


    Irfan.Baig (11/7/2016)


    Hello Dear Mike,

    I think we can also remove duplicates using simplest, easiest and better way without using temp table/s logic:

    WITH CTE_InsurancePolicy AS (

    SELECT PolicyId, FirstName, Surname, PolicyStartDate, PolicyEndDate

    ,ROW_NUMBER() OVER (PARTITION BY PolicyId, FirstName, Surname, PolicyStartDate, PolicyEndDate ORDER BY PolicyId) AS RowNumber

    FROM dbo.InsurancePolicy

    )

    DELETE CTE_InsurancePolicy

    WHERE RowNumber > 1;

    Thanks

    Irfan Baig.

    That's brilliant. I've never seen that before and I found a use for it immediately. I'd got a CTE to create the row numbers but I didn't know you could DELETE directly from it.

  • Good reminder how OVER works, thanks. I don't use it enough but when I do it can be a lifesaver.

  • I've used over to create row numbers and to get the a certain row (example second row) within a group. However for getting distinct, this seems like overkill. Why not just use

    select PolicyId, FirstName, Surname PolicyStartDate, PolicyEndDate

    from dbo.InsurancePolicy

    group by PolicyId, FirstName, Surname PolicyStartDate, PolicyEndDate

    order by PolicyId

Viewing 6 posts - 16 through 20 (of 20 total)

You must be logged in to reply to this topic. Login to reply