SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Returning the Top X row for each group


Returning the Top X row for each group

Author
Message
david.soussan
david.soussan
Grasshopper
Grasshopper (19 reputation)Grasshopper (19 reputation)Grasshopper (19 reputation)Grasshopper (19 reputation)Grasshopper (19 reputation)Grasshopper (19 reputation)Grasshopper (19 reputation)Grasshopper (19 reputation)

Group: General Forum Members
Points: 19 Visits: 29
Dave Ballantyne (12/9/2010)
Tim Widdup (12/9/2010)
Ok , so how could you achieve this in SQL Server 2000? Again, just need the top x from each group ordered by the ID field.

Thanks


You will have to loop or cursor through the data. A "quirky update" may be possible, cant remember if possible in 2000 , sorry.


Or you can do it with a self-join: http://davidsoussan.co.uk/2009/10/10/how-to-sequence-entries-in-each-sub-set/


SELECT t1.userid, COUNT(t1.tableid) AS sequence, t1.tableid, t1.tableid >= t2.tableid AS flg
FROM table t1 INNER JOIN table t2 ON t1.userid = t2.userid
GROUP BY t1.userid, t1.tableid, flg
HAVING flg = TRUE



That method is platform-agnostic so it will work on SQL2000, Access, MySQL, etc. I'd be interested to hear how it compares in performance to the platform-specific row_number(), it runs very efficiently on MySQL and even on Access/JET provided that the join column is properly indexed. You can use any column for the join, or even multiple columns with a multi-colmun index, just so long as it results in a unique index within each subset. Add your condition using the count/sequence number to return the x number of records for each group. My blog shows how to vary the condition and one or two usefull variations and applications of the method.
Dave Ballantyne
Dave Ballantyne
SSCertifiable
SSCertifiable (7.1K reputation)SSCertifiable (7.1K reputation)SSCertifiable (7.1K reputation)SSCertifiable (7.1K reputation)SSCertifiable (7.1K reputation)SSCertifiable (7.1K reputation)SSCertifiable (7.1K reputation)SSCertifiable (7.1K reputation)

Group: General Forum Members
Points: 7134 Visits: 8370
Hi David ,

Pretty poorly. Your method also performs poorly in relation to a traditional sub-select.
This may or may not be platform agnostic, although an important point its not a concern for me personally. BTW your code does need mods to run in a SQLServer environment ;-)

See the attached image for a screen shot from profiler


drop table #testtab
go
create table #testtab
(
userid integer,
tableid integer
)
go
create unique clustered index idx1 on #testtab(userid,tableid)
go
with cter
as
(
select top(1000) row_number() over (order by (select null)) as r
from syscolumns a cross join syscolumns b cross join syscolumns c
)
insert into #testtab
Select r1.r,r2.r
from cter r1 cross join cter r2

select @@rowcount -- 1,000,000 rows
go
-- David Soussan Method
SELECT t1.userid, COUNT(t1.tableid) AS sequence, t1.tableid, case when t1.tableid >= t2.tableid then 1 else 0 end AS flg
FROM #testtab t1 INNER JOIN #testtab t2 ON t1.userid = t2.userid
GROUP BY t1.userid, t1.tableid, case when t1.tableid >= t2.tableid then 1 else 0 end
HAVING case when t1.tableid >= t2.tableid then 1 else 0 end = 1

go
-- count(*) sub select
select *,(Select count(*)
from #testtab t2
where t1.userid = t2.userid
and t2.TableId <= t1.tableId)
from #testtab t1
go
-- 2005 + , row_number()
Select * ,row_number() over (partition by userid order by tableid)
from #testtab





Clear Sky SQL
My Blog
Attachments
counts.jpg (42 views, 34.00 KB)
david.soussan
david.soussan
Grasshopper
Grasshopper (19 reputation)Grasshopper (19 reputation)Grasshopper (19 reputation)Grasshopper (19 reputation)Grasshopper (19 reputation)Grasshopper (19 reputation)Grasshopper (19 reputation)Grasshopper (19 reputation)

Group: General Forum Members
Points: 19 Visits: 29
Well Dave, I was hoping for some real benchmark data, but thanks anyway.

Of course the sql needs to be changed to the dialect specific to the platform - that is always a given. I'm not sure why you have to use CASE on sql server, but then I am a sql server newbie (not an sql newbie just a sql server newbie). Perhaps that is what makes it soooo slow and the syntax so ugly. On Access you have to use the HAVING clause for this to work but the boolean calculation is only a very small performance hit. On MySQL v 4 other problems arise, like the lack of table-type subquery support. v5 on the other hand would allow the sub-query select so that is just fine.

As I said, if all else fails then my method will work on any platform, just adjust the sql to the local dialect. If the platform has other tools then use them.
Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (214K reputation)SSC Guru (214K reputation)SSC Guru (214K reputation)SSC Guru (214K reputation)SSC Guru (214K reputation)SSC Guru (214K reputation)SSC Guru (214K reputation)SSC Guru (214K reputation)

Group: General Forum Members
Points: 214116 Visits: 41979
david.soussan (12/10/2010)
Well Dave, I was hoping for some real benchmark data, but thanks anyway.


Setup SQL Profiler and you'll have it. ;-)

So far as true portability across platforms goes, it's a myth.

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (214K reputation)SSC Guru (214K reputation)SSC Guru (214K reputation)SSC Guru (214K reputation)SSC Guru (214K reputation)SSC Guru (214K reputation)SSC Guru (214K reputation)SSC Guru (214K reputation)

Group: General Forum Members
Points: 214116 Visits: 41979
BTW... the problem with the first two queries (after the test data setup) is that they both have full blown Triangular Joins in them.

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Dave Ballantyne
Dave Ballantyne
SSCertifiable
SSCertifiable (7.1K reputation)SSCertifiable (7.1K reputation)SSCertifiable (7.1K reputation)SSCertifiable (7.1K reputation)SSCertifiable (7.1K reputation)SSCertifiable (7.1K reputation)SSCertifiable (7.1K reputation)SSCertifiable (7.1K reputation)

Group: General Forum Members
Points: 7134 Visits: 8370
Jeff Moden (12/12/2010)

Setup SQL Profiler and you'll have it.;-)



Dave Ballantyne (12/10/2010)

See the attached image for a screen shot from profiler


;-)



Clear Sky SQL
My Blog
david.soussan
david.soussan
Grasshopper
Grasshopper (19 reputation)Grasshopper (19 reputation)Grasshopper (19 reputation)Grasshopper (19 reputation)Grasshopper (19 reputation)Grasshopper (19 reputation)Grasshopper (19 reputation)Grasshopper (19 reputation)

Group: General Forum Members
Points: 19 Visits: 29
Dave, whoops, looks like we both missed the attachment, thanks.
Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (214K reputation)SSC Guru (214K reputation)SSC Guru (214K reputation)SSC Guru (214K reputation)SSC Guru (214K reputation)SSC Guru (214K reputation)SSC Guru (214K reputation)SSC Guru (214K reputation)

Group: General Forum Members
Points: 214116 Visits: 41979
Dave Ballantyne (12/12/2010)
Jeff Moden (12/12/2010)

Setup SQL Profiler and you'll have it.;-)



Dave Ballantyne (12/10/2010)

See the attached image for a screen shot from profiler


;-)


Yep... I definitely missed the attachment.

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Thinknook
Thinknook
Mr or Mrs. 500
Mr or Mrs. 500 (575 reputation)Mr or Mrs. 500 (575 reputation)Mr or Mrs. 500 (575 reputation)Mr or Mrs. 500 (575 reputation)Mr or Mrs. 500 (575 reputation)Mr or Mrs. 500 (575 reputation)Mr or Mrs. 500 (575 reputation)Mr or Mrs. 500 (575 reputation)

Group: General Forum Members
Points: 575 Visits: 164990
Madhivanan-208264 (12/6/2010)
More methods are available here
http://beyondrelational.com/blogs/madhivanan/archive/2008/09/12/return-top-n-rows.aspx


Awesome article Dave, and a great follow up (and hollistic) resource Old Hand. Thanks for sharing guys!

-----------------
... Then again, I could be totally wrong! Check the answer.
Check out posting guidelines here for faster more precise answers.

I believe in Codd
... and Thinknook is my Chamber of Understanding
ganotedp
ganotedp
SSC Journeyman
SSC Journeyman (79 reputation)SSC Journeyman (79 reputation)SSC Journeyman (79 reputation)SSC Journeyman (79 reputation)SSC Journeyman (79 reputation)SSC Journeyman (79 reputation)SSC Journeyman (79 reputation)SSC Journeyman (79 reputation)

Group: General Forum Members
Points: 79 Visits: 216
Why not DENSE_RANK instead of ROW_NUMBER?
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search