Delete Duplicates

Posted Tuesday, May 13, 2008 7:37 AM


SSC Veteran

I agree with the others, great question!

Q

Please take a number. Now serving emergency 1,203,894

Post #499593
Posted Tuesday, May 13, 2008 10:07 AM
SSC-Enthusiastic

This answer works for the given data, but it wouldn't work if we inserted a row like:

insert into #new(id,keyvalue) values (1,'bb')

The stated answer assumes a duplicate is defined by the ID column alone and would delete the above row even though it's unique in the table. Since no PK is defined on the table, I think we must assume a duplicate is defined by multiple rows with identical values for every column. So if we include both the ID and keyvalue columns in the over() clause, we will delete only non-unique rows:

with numbered as
(
SELECT rowno=row_number() over (partition by ID, keyvalue order by ID, keyvalue),ID,keyvalue
from #new
)

delete from numbered where rowno>1
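
For anyone who wants to try it, here is a minimal repro. The table definition from the original question isn't shown in this thread, so the #new DDL below is an assumption:

create table #new (ID int, Keyvalue varchar(10));   -- assumed shape, not the original QotD DDL
insert into #new (ID, Keyvalue) values (1, 'aa');
insert into #new (ID, Keyvalue) values (1, 'aa');   -- true duplicate of the row above
insert into #new (ID, Keyvalue) values (1, 'bb');   -- same ID, different keyvalue
insert into #new (ID, Keyvalue) values (2, 'cc');

with numbered as
(
    select rowno = row_number() over (partition by ID, Keyvalue order by ID, Keyvalue),
           ID, Keyvalue
    from #new
)
delete from numbered where rowno > 1;

select * from #new;   -- (1,'bb') survives; only the extra copy of (1,'aa') is removed
drop table #new;
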
Post #499791
Posted Tuesday, May 13, 2008 11:03 AM
Hall of Fame

J is correct.
As for the SQL, the old approach will also work for this particular case:

declare @I int

select @I = (select count(convert(varchar(2), ID) + Keyvalue)
             from #new
             group by convert(varchar(2), ID) + Keyvalue
             having count(convert(varchar(2), ID) + Keyvalue) > 1)
set @I = @I - 1
set rowcount @I        -- limit the following delete to all but one of the duplicate rows

delete #New
where convert(varchar(2), ID) + Keyvalue in
      (select convert(varchar(2), ID) + Keyvalue
       from #new
       group by convert(varchar(2), ID) + Keyvalue
       having count(convert(varchar(2), ID) + Keyvalue) > 1)

set rowcount 0         -- reset so later statements are not limited
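
One caveat: assigning @I from that subquery only works when there is exactly one duplicated group; with more than one, the subquery returns multiple values and the assignment fails, which is why it fits this particular case. On SQL Server 2005 and later the same limited delete can also be written with DELETE TOP in place of SET ROWCOUNT. This is just a sketch along the same lines (assuming, as in the QotD data, that a duplicated group exists), not part of the original answer:

declare @I int

select @I = count(*) - 1            -- rows in the (single) duplicated group, minus the copy to keep
from #new
group by convert(varchar(2), ID) + Keyvalue
having count(*) > 1

delete top (@I)
from #new
where convert(varchar(2), ID) + Keyvalue in
      (select convert(varchar(2), ID) + Keyvalue
       from #new
       group by convert(varchar(2), ID) + Keyvalue
       having count(convert(varchar(2), ID) + Keyvalue) > 1)
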



Regards,
Yelena Varshal

Post #499834
Posted Tuesday, May 13, 2008 11:05 AM
SSCrazy

So if we include both the ID and keyvalue columns in the over() clause, we will delete only non-unique rows:


What you derive with the "partition by" clause is literally a "partition" over whichever set of columns you choose to define uniqueness. If you prefer, you can also use only the "order by" portion of the clause and generate a dynamic numeric key (a sort of "PK" ID) rather than a static one:


select rowno = row_number() over (order by mychoiceofID, [2ndchoice], etcchoice), *
from mytable


Jamie
Post #499838
Posted Tuesday, May 13, 2008 12:06 PM
Mr or Mrs. 500

Excellent question. I learned something useful.

One application of OVER that seems to have potential (note hedging ;) ) is that it allows applying an aggregate function to more than one grouping in the same query. For instance,

select customerID, productID, orderDate, orderAmount,
CustomerTotal = sum(orderAmount) over (partition by customerID),
ProductTotal = sum(orderAmount) over (partition by productID)
from Orders
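
For what it's worth, here is a rough sketch of the pre-OVER() equivalent for comparison, using the same (assumed) Orders columns; it needs a separate pass over Orders for each total via derived tables:

select o.customerID, o.productID, o.orderDate, o.orderAmount,
       c.CustomerTotal, p.ProductTotal
from Orders as o
join (select customerID, CustomerTotal = sum(orderAmount)
      from Orders
      group by customerID) as c on c.customerID = o.customerID
join (select productID, ProductTotal = sum(orderAmount)
      from Orders
      group by productID) as p on p.productID = o.productID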


Any thoughts on performance?

Edited for clarity.
Post #499882
Posted Tuesday, May 13, 2008 12:53 PM
SSCrazy

Any thoughts on performance?


If I recall correctly, I came across this syntax with help from Erland Sommarskog. I was looking for a way to remove duplicates from a table with over 100 million rows. As it turned out, this was the only method that performed well enough, given the amount of CPU and RAM available, to even do the job.



Jamie
Post #499910
Posted Wednesday, May 14, 2008 12:11 AM
SSCertifiable

Excellent question.


Post #500191
Posted Wednesday, May 14, 2008 6:20 AM
SSCrazy

Of note, I learned something today as well regarding the semicolons, as my background is VB, where semicolons make our lives more difficult. I am focusing on getting those semicolons into my queries. Thanks for all the nice comments... appreciated!

Jamie
Post #500377
Posted Thursday, May 15, 2008 8:01 AM
Hall of Fame

Yes, this is useful. In fact, I ran into a case yesterday where I was able to employ this method to remove duplicates from a table. And as performance goes, it was much more efficient than the method I would commonly have used, which was to copy all the data out into a temp table, truncate the original, and then select distinct back into the table.
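
For reference, that older approach looks roughly like this (a sketch only; the table name and the assumption that it has no foreign keys or identity column are mine, not from the actual case):

select * into #stage from dbo.MyTable                     -- copy everything out
truncate table dbo.MyTable
insert into dbo.MyTable select distinct * from #stage     -- put back only the distinct rows
drop table #stage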
Post #501332
Posted Thursday, May 22, 2008 11:08 AM
Valued Member

If you have code like this or need to use code like this, you have other greater issues going on and need to put the pipe down.
Post #505338