Delete Duplicates

  • Jamie-2229

    SSCrazy Eights

    Points: 8151

    Comments posted to this topic are about the item Delete Duplicates


  • Manie Verster


    Points: 7022

    Grrrreat question, Excellent answer, 'nuff said:P:P:P:P:P:P

    :-PManie Verster
    South Africa

    I can do all things through Christ who strengthens me. - Holy Bible
    I am a man of fixed and unbending principles, the first of which is to be flexible at all times. - Everett Mckinley Dirkson (Well, I am trying. - Manie Verster)

  • Dugi


    Points: 17998

    I'm wrong here but this is not correct question;

    1 & 2 are the same ... with same condition! So the correct answer is 2 but the 1 is the same everything ... too stupid!

    3 is different!

    also with three options I do not have valid parse for T-SQL code!

    how can you explain it!?

    :hehe::w00t: :P:P:P:P:P


  • Dugi


    Points: 17998

    OHHH nooooooooooooooooo

    I work just from select statement not "with numbered"


    nononononoonononononononono I anaylized very well what happend but without this piece of code!



  • sho-467341

    Ten Centuries

    Points: 1062

    whats happened to sql - it used to be so simple 😉

  • sho-467341

    Ten Centuries

    Points: 1062

    dougi - are you using sql 2000 by any chance?

  • John Laskey

    Right there with Babe

    Points: 734

    I agree, this was an excellent question!

  • Ron McCullough

    SSC Guru

    Points: 63877

    Thanks for an Excellent question really rather simply once one understands how partitioning works........... learned a lot from it ...

    If everything seems to be going well, you have obviously overlooked something.


    Please help us, help you -before posting a question please read[/url]
    Before posting a performance problem please read[/url]

  • Patrick Cahill


    Points: 5626

    Gives me more stuff to figure out, thanks for the question.

  • Mauve


    Points: 11316

    A semicolon at the beginning of the statement is valid

    I'm seeing a lot of this lately. I.e., statements such as:

    ;WITH ...

    Note that the semicolon is NOT part of the WITH statement syntax. This issue only arises if the prior statement did not end with a semicolon (which it should). So the leading semicolon is terminating the prior statement.

    SQL Server is starting to enforce proper statement termination. IMO, long overdue. Get into the habit of properly terminating ALL of your T-SQL statements. Even Microsoft recommends using it. From BOL:

    Many code examples use a semicolon ( ; ) as a Transact-SQL statement terminator. Although the semicolon is not required, using it is considered a good practice.

    [font="Arial Narrow"](PHB) I think we should build an SQL database. (Dilbert) What color do you want that database? (PHB) I think mauve has the most RAM.[/font]

  • Q -631159

    Ten Centuries

    Points: 1077

    I agree with the others, great question!


    Please take a number. Now serving emergency 1,203,894

  • redrazorback

    SSC Veteran

    Points: 207

    This anwer works for the given data but wouldn't work if we inserted a row like:

    insert into #new(id,keyvalue) values (1,'bb')

    The stated answer assumes a duplicate is defined by the ID column and would delete the above row even thought it's unique in the table. Since no PK is defined on the table I think we must assume a duplicate is defined by multiple rows with identical values for every column. So if we include both ID and keyvalue columns in the over() clause we will delete only non-unique rows:

    with numbered as


    SELECT rowno=row_number() over (partition by ID, keyvalue order by ID, keyvalue),ID,keyvalue

    from #new


    delete from numbered where rowno>1

  • Yelena Varshal


    Points: 34215

    J is correct.

    As for SQL, the old one will work too for this particular case:

    declare @I int

    select @I = (select count(convert(varchar(2),ID)+Keyvalue) from #new

    group by (convert(varchar(2),ID)+Keyvalue)

    having count(convert(varchar(2),ID)+Keyvalue) >1)

    set @I = @I-1

    set rowcount @I

    delete #New where convert(varchar(2),ID)+Keyvalue


    (select convert(varchar(2),ID)+Keyvalue from #new

    group by (convert(varchar(2),ID)+Keyvalue)

    having count(convert(varchar(2),ID)+Keyvalue) >1)

    Regards,Yelena Varsha

  • Jamie-2229

    SSCrazy Eights

    Points: 8151

    So if we include both ID and keyvalue columns in the over() clause we will delete only non-unique rows:

    The angle you derive with the "Partition by" clause is literally a "partition" of a particular cluster of fields defined uniquely. If you prefer, then also, use only the "order by" portion of the clause and create a numeric cluster("PK" ID ) that is dynamic [rather than static].

    Select rowno=Row_Number() over(Order by mychoiceofID,2ndchoice,etcchoice), * from mytable.


  • Olga B

    Default port

    Points: 1499

    Excellent question. I learned something useful.

    One application of OVER that seems to have potential (note hedging 😉 ) is that it allows applying aggregate function to more than one different grouping in the same query. For instance,

    select customerID, productID, orderDate, orderAmount,

    CustomerTotal = sum(orderAmount) over (partition by customerID),

    ProductTotal = sum(orderAmount) over (partition by productID)

    from Orders

    Any thoughts on performance?

    Edited for clarity.

Viewing 15 posts - 1 through 15 (of 23 total)

You must be logged in to reply to this topic. Login to reply