April 4, 2012 at 5:10 pm
I need to write a query that returns a list of students and their *final* grade. The table I'm querying contains multiple entries for each student depending on how many times that student attempted for a better grade. I want a list containing only the studenID and grade having the highest revision cycle (which means it's their final grade).
--DDL
create table grade_status (
statusID int
,revision_cycle int
,grade_assigned char(1)
,studentID int
,reason varchar (20)
)
insert into grade_status
values
(1, 1, 'C', 100, 'messy'),
(2, 2, 'B', 100, 'nice'),
(3, 3, 'A', 100, 'super'),
(4, 1, 'C', 235, 'messy'),
(5, 2, 'A', 235, 'super'),
(6, 1, 'C', 331, 'messy'),
(7, 2, 'C', 331, 'messy'),
(8, 3, 'C', 331, 'messy'),
(9, 4, 'B', 331, 'nice')
;
--current query
select studentID, grade_assigned
from grade_status
where revision_cycle = (select max(revision_cycle) from grade_status)
group by studentID, grade_assigned;
My current query returns only ONE studentID (331) and grade_assigned (B) because he completed the most revisions (4). The subquery I wrote in the where clause is useless.
I believe I need to loop through each respective studentIDs array of revisions in some subquery whose result will be used by the parent query, but how? CTE?
-
April 4, 2012 at 5:38 pm
Here's my attempt, probably can be shortened or better versions posted by the experts.
select row, studentid, revision_cycle, grade_assigned, reason
from
(
select row_number() over (partition by studentid order by revision_cycle desc) row,
studentid, grade_assigned, reason, revision_cycle from grade_status
) stus
where row = 1
Look up row_number() in books on line, what it does is number the rows in a particular order (the order by revision_cycle desc), and reset the row number counter according to a grouping (partition by student id), then you select the first row (due to the 'desc') keyword which represents the latest revision_cycle for each studentid.
Personally I like the old approach without row_number for this particular situation but I have to admit the new row_number() syntax is pretty cool.
April 4, 2012 at 5:50 pm
patrickmcginnis59 (4/4/2012)
Here's my attempt, probably can be shortened or better versions posted by the experts.
select row, studentid, revision_cycle, grade_assigned, reason
from
(
select row_number() over (partition by studentid order by revision_cycle desc) row,
studentid, grade_assigned, reason, revision_cycle from grade_status
) stus
where row = 1
Look up row_number() in books on line, what it does is number the rows in a particular order (the order by revision_cycle desc), and reset the row number counter according to a grouping (partition by student id), then you select the first row (due to the 'desc') keyword which represents the latest revision_cycle for each studentid.
Personally I like the old approach without row_number for this particular situation but I have to admit the new row_number() syntax is pretty cool.
close, if you dont mind having the revision_cycle number in the results i would do it in a single query.
Select MAX(revision_cycle) OVER (Partition BY studentID), studentID, grade_assigned FROM grade_status
if you only want the student id and grade i would do it like this
-- As Common table expression
WITH CTE AS (Select MAX(revision_cycle) OVER (Partition BY studentID) AS MaxID, studentID, grade_assigned FROM grade_status)
SELECT studentID, grade_assigned FROM CTE
--As a derived table
SELECT studentID, grade_assigned FROM (Select MAX(revision_cycle) OVER (Partition BY studentID) AS MaxID, studentID, grade_assigned FROM grade_status) x
EDIT: did not look close enough at my reasults.
EDIT 2:
Here are the correct results. man its to late right now.
--With a Common table expression
WITH CTE AS (
SELECT MAX(revision_cycle) AS MaxRevission, studentID FROM grade_status GROUP BY studentID)
SELECT g.grade_assigned, g.studentID From grade_status g
INNER JOIN CTE c
ON g.studentID = c.studentID
AND c.MaxRevission = g.revision_cycle
--With a derived table
SELECT g.grade_assigned, g.studentID From grade_status g
INNER JOIN (SELECT MAX(revision_cycle) AS MaxRevission, studentID FROM grade_status GROUP BY studentID) c
ON g.studentID = c.studentID
AND c.MaxRevission = g.revision_cycle
For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw[/url]
Need to Split some strings? Jeff Moden's DelimitedSplit8K[/url]
Jeff Moden's Cross tab and Pivots Part 1[/url]
Jeff Moden's Cross tab and Pivots Part 2[/url]
April 4, 2012 at 6:04 pm
your rownumber over will work and may be faster than my join to a sub query. i would try both in your system and use which every is faster.
For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw[/url]
Need to Split some strings? Jeff Moden's DelimitedSplit8K[/url]
Jeff Moden's Cross tab and Pivots Part 1[/url]
Jeff Moden's Cross tab and Pivots Part 2[/url]
April 4, 2012 at 6:12 pm
Well, your very first query/response works and I was going to go with that (apply to real life situation), but I'll make sure to look at the others too.
Thanks so much Patrick!
-
April 4, 2012 at 6:22 pm
xenophilia (4/4/2012)
Well, your very first query/response works and I was going to go with that (apply to real life situation), but I'll make sure to look at the others too.Thanks so much Patrick!
the strike through stuff is wrong and does not work. the queries in the EDIT 2 code block are exactly the same just written a little different. im moving where i do my MAX(revision_cycle) the top is in a common table expression the second in a (subquery, derived table, what every you want to call it as long as you understand it)
For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw[/url]
Need to Split some strings? Jeff Moden's DelimitedSplit8K[/url]
Jeff Moden's Cross tab and Pivots Part 1[/url]
Jeff Moden's Cross tab and Pivots Part 2[/url]
April 5, 2012 at 12:15 pm
I hesitate going the CTE route for one reason. I dont' know if it can form the basis of an SSRS or Excel report. Can I use a query that uses CTE's as basis of an SSRS or Excel report?
-
April 5, 2012 at 1:46 pm
You sure can. Best to put it in a stored proc and call the proc from Excel or SSRS. Also, I just did a blog post on window functions (rank(), row_number(), etc.) that also references a great piece by Itzik Ben-Gan. Click on the SQLknowitall link in my sig.
Jared
CE - Microsoft
April 5, 2012 at 1:58 pm
Appreciate it and will do.
-
April 5, 2012 at 6:36 pm
SQLKnowItAll. sorry to get off track, but I tried Patrick's CTE and I am finding it much slower then the rowcount. At least in my dw where I have over 2 million records.
For readability I want to use CTEs but in the long run appears I may have a lot of rework to do if I don't stick with row_count solution.
Any advice on performance and CTEs?
-
April 5, 2012 at 6:43 pm
A quick look by using SET STATISTICS IO ON may give you some information. A more in depth look would be looking at the Actual Execution Plan. Can you post the results from either of those? I didn't take a deep look at the differences between the scripts, so I don't want to guess.
Jared
CE - Microsoft
April 6, 2012 at 3:05 pm
capn.hector or somebody, will you please assist me with CTEs some more? I'm having a terrible time of it. I want to select against the CTE.
Please show me how to build on top of the CTE and it's subsequent query yet another query which will return only those students who took biology (there should only be one student).
--First CTE
WITH CTE AS (
SELECT MAX(revision_cycle) AS MaxRevission, studentID
FROM grade_status
GROUP BY studentID)
SELECT g.grade_assigned, g.studentID From grade_status g
INNER JOIN CTE c
ON g.studentID = c.studentID
AND c.MaxRevission = g.revision_cycle
---DDL
create table grade_status (
statusID int
,revision_cycle int
,grade_assigned char(1)
,studentID int
,reason varchar (20)
,course varchar (20)
)
insert into grade_status
values
(1, 1, 'C', 100, 'messy', 'biology'),
(2, 2, 'B', 100, 'nice', 'biology'),
(3, 3, 'A', 100, 'super', 'biology'),
(4, 1, 'C', 235, 'messy', 'physics'),
(5, 2, 'A', 235, 'super', 'physics'),
(6, 1, 'C', 331, 'messy', 'chemistry'),
(7, 2, 'C', 331, 'messy', 'chemistry'),
(8, 3, 'C', 331, 'messy', 'chemistry'),
(9, 4, 'B', 331, 'nice', 'chemistry')
-
April 6, 2012 at 3:13 pm
Sorry, misread the SQL.
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
April 6, 2012 at 3:14 pm
you need to add a where clause to your query. see this link to read up on them
http://msdn.microsoft.com/en-us/library/ms188047.aspx
For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw[/url]
Need to Split some strings? Jeff Moden's DelimitedSplit8K[/url]
Jeff Moden's Cross tab and Pivots Part 1[/url]
Jeff Moden's Cross tab and Pivots Part 2[/url]
April 6, 2012 at 3:15 pm
drew.allen (4/6/2012)
NM was a response to drew misreading the sql
For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw[/url]
Need to Split some strings? Jeff Moden's DelimitedSplit8K[/url]
Jeff Moden's Cross tab and Pivots Part 1[/url]
Jeff Moden's Cross tab and Pivots Part 2[/url]
Viewing 15 posts - 1 through 15 (of 28 total)
You must be logged in to reply to this topic. Login to reply