Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase «««1234»»

Returning the Top X row for each group Expand / Collapse
Author
Message
Posted Friday, December 10, 2010 2:21 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Wednesday, May 16, 2012 6:29 PM
Points: 7, Visits: 24
Dave Ballantyne (12/9/2010)
Tim Widdup (12/9/2010)
Ok , so how could you achieve this in SQL Server 2000? Again, just need the top x from each group ordered by the ID field.

Thanks


You will have to loop or cursor through the data. A "quirky update" may be possible, cant remember if possible in 2000 , sorry.


Or you can do it with a self-join: http://davidsoussan.co.uk/2009/10/10/how-to-sequence-entries-in-each-sub-set/

SELECT t1.userid, COUNT(t1.tableid) AS sequence, t1.tableid, t1.tableid >= t2.tableid AS flg
FROM table t1 INNER JOIN table t2 ON t1.userid = t2.userid
GROUP BY t1.userid, t1.tableid, flg
HAVING flg = TRUE

That method is platform-agnostic so it will work on SQL2000, Access, MySQL, etc. I'd be interested to hear how it compares in performance to the platform-specific row_number(), it runs very efficiently on MySQL and even on Access/JET provided that the join column is properly indexed. You can use any column for the join, or even multiple columns with a multi-colmun index, just so long as it results in a unique index within each subset. Add your condition using the count/sequence number to return the x number of records for each group. My blog shows how to vary the condition and one or two usefull variations and applications of the method.
Post #1032884
Posted Friday, December 10, 2010 3:39 AM


SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Yesterday @ 4:50 AM
Points: 1,949, Visits: 8,315
Hi David ,

Pretty poorly. Your method also performs poorly in relation to a traditional sub-select.
This may or may not be platform agnostic, although an important point its not a concern for me personally. BTW your code does need mods to run in a SQLServer environment

See the attached image for a screen shot from profiler

drop table #testtab
go
create table #testtab
(
userid integer,
tableid integer
)
go
create unique clustered index idx1 on #testtab(userid,tableid)
go
with cter
as
(
select top(1000) row_number() over (order by (select null)) as r
from syscolumns a cross join syscolumns b cross join syscolumns c
)
insert into #testtab
Select r1.r,r2.r
from cter r1 cross join cter r2

select @@rowcount -- 1,000,000 rows
go
-- David Soussan Method
SELECT t1.userid, COUNT(t1.tableid) AS sequence, t1.tableid, case when t1.tableid >= t2.tableid then 1 else 0 end AS flg
FROM #testtab t1 INNER JOIN #testtab t2 ON t1.userid = t2.userid
GROUP BY t1.userid, t1.tableid, case when t1.tableid >= t2.tableid then 1 else 0 end
HAVING case when t1.tableid >= t2.tableid then 1 else 0 end = 1

go
-- count(*) sub select
select *,(Select count(*)
from #testtab t2
where t1.userid = t2.userid
and t2.TableId <= t1.tableId)
from #testtab t1
go
-- 2005 + , row_number()
Select * ,row_number() over (partition by userid order by tableid)
from #testtab





Clear Sky SQL
My Blog
Kent user group


  Post Attachments 
counts.jpg (32 views, 34.85 KB)
Post #1032913
Posted Friday, December 10, 2010 4:39 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Wednesday, May 16, 2012 6:29 PM
Points: 7, Visits: 24
Well Dave, I was hoping for some real benchmark data, but thanks anyway.

Of course the sql needs to be changed to the dialect specific to the platform - that is always a given. I'm not sure why you have to use CASE on sql server, but then I am a sql server newbie (not an sql newbie just a sql server newbie). Perhaps that is what makes it soooo slow and the syntax so ugly. On Access you have to use the HAVING clause for this to work but the boolean calculation is only a very small performance hit. On MySQL v 4 other problems arise, like the lack of table-type subquery support. v5 on the other hand would allow the sub-query select so that is just fine.

As I said, if all else fails then my method will work on any platform, just adjust the sql to the local dialect. If the platform has other tools then use them.
Post #1032940
Posted Sunday, December 12, 2010 7:45 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 3:28 PM
Points: 36,995, Visits: 31,521
david.soussan (12/10/2010)
Well Dave, I was hoping for some real benchmark data, but thanks anyway.


Setup SQL Profiler and you'll have it.

So far as true portability across platforms goes, it's a myth.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1033552
Posted Sunday, December 12, 2010 8:10 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 3:28 PM
Points: 36,995, Visits: 31,521
BTW... the problem with the first two queries (after the test data setup) is that they both have full blown Triangular Joins in them.

--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1033556
Posted Sunday, December 12, 2010 11:43 PM


SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Yesterday @ 4:50 AM
Points: 1,949, Visits: 8,315
Jeff Moden (12/12/2010)

Setup SQL Profiler and you'll have it.



Dave Ballantyne (12/10/2010)

See the attached image for a screen shot from profiler






Clear Sky SQL
My Blog
Kent user group
Post #1033598
Posted Monday, December 13, 2010 6:50 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Wednesday, May 16, 2012 6:29 PM
Points: 7, Visits: 24
Dave, whoops, looks like we both missed the attachment, thanks.
Post #1033766
Posted Friday, December 31, 2010 5:42 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 3:28 PM
Points: 36,995, Visits: 31,521
Dave Ballantyne (12/12/2010)
Jeff Moden (12/12/2010)

Setup SQL Profiler and you'll have it.



Dave Ballantyne (12/10/2010)

See the attached image for a screen shot from profiler




Yep... I definitely missed the attachment.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1041497
Posted Friday, November 23, 2012 3:25 AM


SSC Journeyman

SSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC Journeyman

Group: General Forum Members
Last Login: Thursday, August 28, 2014 1:50 PM
Points: 91, Visits: 164,496
Madhivanan-208264 (12/6/2010)
More methods are available here
http://beyondrelational.com/blogs/madhivanan/archive/2008/09/12/return-top-n-rows.aspx


Awesome article Dave, and a great follow up (and hollistic) resource Old Hand. Thanks for sharing guys!


-----------------
... Then again, I could be totally wrong! Check the answer.
Check out posting guidelines here for faster more precise answers.

I believe in Codd
... and Thinknook is my Chamber of Understanding
Post #1388097
Posted Friday, November 23, 2012 5:32 AM


Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Sunday, May 19, 2013 8:22 AM
Points: 17, Visits: 148
Why not DENSE_RANK instead of ROW_NUMBER?
Post #1388142
« Prev Topic | Next Topic »

Add to briefcase «««1234»»

Permissions Expand / Collapse