Top N and group by query...

• swjohnson

Hall of Fame

Points: 3254

Ok, my brain is fried and need some extra eyes and brains....

I have a table with three fields

EmailID (int and identity)

ProjID (int)

EmailStatus (int)

Ok, lets says that in the table are 3000 records for ProjID 1454, 50 records for ProjID 1455, and 2 records for ProjID 1456 (all have the same status =100)

What I need to do in a set based solution (I can accomplish it in a cursor or temp table format but want something faster...) is Select the Top 200 records for each ProjID where emailStatus=100.

I have been trying varying combinations of Top N and Group by to no avail.  Basically the query should return 200 records for ProjID 1454, the 50 records for 1455 and the 2 records for 1456.

Again, I have code to do this via a cursor and temp tables but want to see about a set based solution.  (We are on SQL 2000)

Any thoughts?

Thanks

SJ

• Ram Ram

Ten Centuries

Points: 1079

Try this out

DECLARE @Rows INT

SELECT  @Rows = 200

-- Replace @tbl with your actual tablename

SELECT A.*

FROM @tbl AS A

LEFT JOIN  (

SELECT  ProjId,

MIN(EmailId) AS MinId,

MAX(EmailId) AS MaxId,

COUNT(ProjId) AS RCnt

FROM @tbl

GROUP BY ProjId

&nbsp AS Derived

ON A.ProjId = Derived.ProjId

WHERE A.EmailId BETWEEN Derived.MinId AND Derived.MinId + @Rows-1

Ram

• Andrew in WV

SSCommitted

Points: 1655

Ram's solution works as long as all values of EmailID for a particular project are contiguous. I think this will work regardless:

```SELECT EmailID, ProjID, EmailStatus
FROM <TableName>
WHERE EmailID IN (
SELECT TOP 200 EmailID
FROM <TableName> IT
WHERE <TableName>.ProjID = IT.ProjID
AND EmailStatus = 100
)
ORDER BY ProjID, EmailStatus, EmailID
```

--Andrew

• swjohnson

Hall of Fame

Points: 3254

Thanks to both you and Ramamoorthy.  After a break and some more tinkering I was able to come up with something similar to Andrew's query.  However, both work for what I need but Ramamoorthy's is faster in my experiments and the execution plans are quite different.  Especially when I use it on my large recordsets (over 100,000 records pending) and it is not dependent upon consecutive/contiguous EmailIDs.

Thanks again!

SJ

Viewing 4 posts - 1 through 4 (of 4 total)