Delete Duplicates

Question

Post reply

Delete Duplicates

Jamie-2229

SSCrazy Eights

Points: 8156
More actions
May 12, 2008 at 9:51 pm

#380457

Comments posted to this topic are about the item Delete Duplicates
Jamie

Viewing 15 posts - 1 through 15 (of 23 total)

You must be logged in to reply to this topic. Login to reply

Manie Verster SSCertifiable Points: 7152 More actions · Answer 1

Grrrreat question, Excellent answer, 'nuff said:P:P:P:P:P:P

:-PManie Verster
Developer
Johannesburg
South Africa

I can do all things through Christ who strengthens me. - Holy Bible
I am a man of fixed and unbending principles, the first of which is to be flexible at all times. - Everett Mckinley Dirkson (Well, I am trying. - Manie Verster)

Dugi SSCoach Points: 17998 More actions · Answer 2

I'm wrong here but this is not correct question;

1 & 2 are the same ... with same condition! So the correct answer is 2 but the 1 is the same everything ... too stupid!

3 is different!

also with three options I do not have valid parse for T-SQL code!

how can you explain it!?

:hehe::w00t: :P:P:P:P:P

============================================================
SELECT YOUR PROBLEM FROM SSC.com WHERE PROBLEM DESCRIPTION =
http://www.sqlservercentral.com/articles/Best+Practices/61537/[/url]

Dugi SSCoach Points: 17998 More actions · Answer 3

OHHH nooooooooooooooooo

I work just from select statement not "with numbered"

pthuuuu

nononononoonononononononono I anaylized very well what happend but without this piece of code!

:hehe:

============================================================
SELECT YOUR PROBLEM FROM SSC.com WHERE PROBLEM DESCRIPTION =
http://www.sqlservercentral.com/articles/Best+Practices/61537/[/url]

sho-467341 Ten Centuries Points: 1062 More actions · Answer 4

whats happened to sql - it used to be so simple 😉

sho-467341 Ten Centuries Points: 1062 More actions · Answer 5

sho-467341

Ten Centuries

Points: 1062

May 13, 2008 at 2:20 am

#814286

dougi - are you using sql 2000 by any chance?

John Laskey Right there with Babe Points: 734 More actions · Answer 6

John Laskey

Right there with Babe

Points: 734

May 13, 2008 at 6:21 am

#814386

I agree, this was an excellent question!

Ron McCullough SSC Guru Points: 63877 More actions · Answer 7

Thanks for an Excellent question really rather simply once one understands how partitioning works........... learned a lot from it ...

If everything seems to be going well, you have obviously overlooked something.

Ron

Please help us, help you -before posting a question please read[/url]
Before posting a performance problem please read[/url]

Patrick Cahill SSCertifiable Points: 6623 More actions · Answer 8

Gives me more stuff to figure out, thanks for the question.

Mauve SSChampion Points: 11316 More actions · Answer 9

A semicolon at the beginning of the statement is valid

I'm seeing a lot of this lately. I.e., statements such as:

;WITH ...

Note that the semicolon is NOT part of the WITH statement syntax. This issue only arises if the prior statement did not end with a semicolon (which it should). So the leading semicolon is terminating the prior statement.

SQL Server is starting to enforce proper statement termination. IMO, long overdue. Get into the habit of properly terminating ALL of your T-SQL statements. Even Microsoft recommends using it. From BOL:

Many code examples use a semicolon ( ; ) as a Transact-SQL statement terminator. Although the semicolon is not required, using it is considered a good practice.

[font="Arial Narrow"](PHB) I think we should build an SQL database. (Dilbert) What color do you want that database? (PHB) I think mauve has the most RAM.[/font]

Q -631159 Ten Centuries Points: 1077 More actions · Answer 10

I agree with the others, great question!

Q

Please take a number. Now serving emergency 1,203,894

redrazorback SSC Veteran Points: 216 More actions · Answer 11

This anwer works for the given data but wouldn't work if we inserted a row like:

insert into #new(id,keyvalue) values (1,'bb')

The stated answer assumes a duplicate is defined by the ID column and would delete the above row even thought it's unique in the table. Since no PK is defined on the table I think we must assume a duplicate is defined by multiple rows with identical values for every column. So if we include both ID and keyvalue columns in the over() clause we will delete only non-unique rows:

with numbered as

(

SELECT rowno=row_number() over (partition by ID, keyvalue order by ID, keyvalue),ID,keyvalue

from #new

)

delete from numbered where rowno>1

Yelena Varshal SSC-Dedicated Points: 34287 More actions · Answer 12

J is correct.

As for SQL, the old one will work too for this particular case:

declare @I int

select @I = (select count(convert(varchar(2),ID)+Keyvalue) from #new

group by (convert(varchar(2),ID)+Keyvalue)

having count(convert(varchar(2),ID)+Keyvalue) >1)

set @I = @I-1

set rowcount @I

delete #New where convert(varchar(2),ID)+Keyvalue

in

(select convert(varchar(2),ID)+Keyvalue from #new

group by (convert(varchar(2),ID)+Keyvalue)

having count(convert(varchar(2),ID)+Keyvalue) >1)

Regards,Yelena Varsha

Jamie-2229 SSCrazy Eights Points: 8156 More actions · Answer 13

So if we include both ID and keyvalue columns in the over() clause we will delete only non-unique rows:

The angle you derive with the "Partition by" clause is literally a "partition" of a particular cluster of fields defined uniquely. If you prefer, then also, use only the "order by" portion of the clause and create a numeric cluster("PK" ID ) that is dynamic [rather than static].

Select rowno=Row_Number() over(Order by mychoiceofID,2ndchoice,etcchoice), * from mytable.

Jamie

Olga B SSCommitted Points: 1529 More actions · Answer 14

Excellent question. I learned something useful.

One application of OVER that seems to have potential (note hedging 😉 ) is that it allows applying aggregate function to more than one different grouping in the same query. For instance,

select customerID, productID, orderDate, orderAmount,

CustomerTotal = sum(orderAmount) over (partition by customerID),

ProductTotal = sum(orderAmount) over (partition by productID)

from Orders

Any thoughts on performance?

Edited for clarity.