Top N and group by query...

Question

Post reply

Top N and group by query...

swjohnson

Hall of Fame

Points: 3254
More actions
September 29, 2006 at 4:17 pm

#115869

Ok, my brain is fried and need some extra eyes and brains....
I have a table with three fields
EmailID (int and identity)
ProjID (int)
EmailStatus (int)
Ok, lets says that in the table are 3000 records for ProjID 1454, 50 records for ProjID 1455, and 2 records for ProjID 1456 (all have the same status =100)
What I need to do in a set based solution (I can accomplish it in a cursor or temp table format but want something faster...) is Select the Top 200 records for each ProjID where emailStatus=100.
I have been trying varying combinations of Top N and Group by to no avail. Basically the query should return 200 records for ProjID 1454, the 50 records for 1455 and the 2 records for 1456.
Again, I have code to do this via a cursor and temp tables but want to see about a set based solution. (We are on SQL 2000)
Any thoughts?
Thanks
SJ
Accepting all invites

Viewing 4 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply

Ram Ram Ten Centuries Points: 1079 More actions · Answer 1

Try this out

DECLARE @Rows INT

SELECT @Rows = 200

-- Replace @tbl with your actual tablename

SELECT A.*

FROM @tbl AS A

LEFT JOIN (

SELECT ProjId,

MIN(EmailId) AS MinId,

MAX(EmailId) AS MaxId,

COUNT(ProjId) AS RCnt

FROM @tbl

GROUP BY ProjId

&nbsp AS Derived

ON A.ProjId = Derived.ProjId

WHERE A.EmailId BETWEEN Derived.MinId AND Derived.MinId + @Rows-1

Ram

Andrew in WV SSCommitted Points: 1655 More actions · Answer 2

Ram's solution works as long as all values of EmailID for a particular project are contiguous. I think this will work regardless:

SELECT EmailID, ProjID, EmailStatus
FROM <TableName>
WHERE EmailID IN (
SELECT TOP 200 EmailID
FROM <TableName> IT
WHERE <TableName>.ProjID = IT.ProjID
  AND EmailStatus = 100
)
ORDER BY ProjID, EmailStatus, EmailID

--Andrew

swjohnson Hall of Fame Points: 3254 More actions · Answer 3

Thanks to both you and Ramamoorthy. After a break and some more tinkering I was able to come up with something similar to Andrew's query. However, both work for what I need but Ramamoorthy's is faster in my experiments and the execution plans are quite different. Especially when I use it on my large recordsets (over 100,000 records pending) and it is not dependent upon consecutive/contiguous EmailIDs.

Thanks again!

SJ

Accepting all invites