cumulative average without cursor

  • I am trying to calculate cumulative GPA which is currently it is being done via a cursor and i was looking to change it to set based if possible.

    The cumulative GPA is basically the average marks of all courses in a term and all previous terms. Where I am having an issue is if the course was taken in a previous term and the mark was lower, then the lower mark in not factored in to the GPA. If a course was taken in a previous term and the mark was higher then the marks for the current term are not factored into the GPA.

    In the example below, math is taken in all three terms and the current term's mark is included in the GPA but not the previous terms because current terms mark is higher. Two classes (bio,chem) are taken in third term but not included due to marks being higher in second term. Any suggestions would be much appreciated. thanks

    CREATE TABLE [gpa]([term] int,

    [course] varchar(10),

    [mark] float);

    INSERT INTO [gpa]

    VALUES (1,'math',3),

    (1,'eng',4),

    (1,'hist',4.5),

    (1,'anth',4.5),

    (2,'comp',3),

    (2,'math',4),

    (2,'bio',4.5),

    (2,'chem',4.5),

    (3,'soc',4.5),

    (3,'math',4.5),

    (3,'bio',4),

    (3,'chem',4);

    DECLARE @term int;

    DECLARE c_gpa CURSOR

    FOR

    SELECT DISTINCT [term]

    FROM [gpa]

    ORDER BY [term];

    OPEN c_gpa;

    WHILE 1 = 1

    BEGIN

    FETCH NEXT FROM c_gpa INTO @term;

    IF @@FETCH_STATUS < 0

    BEGIN

    BREAK

    END;

    UPDATE [a]

    SET [mark] = 0

    FROM [gpa] [a]

    INNER JOIN [gpa]

    ON [a].[term] < .[term]

    AND [a].[course] = .[course]

    AND .[term] = @term

    WHERE .[mark] > [a].[mark]

    AND [a].[term] < @term;

    UPDATE [a]

    SET [mark] = 0

    FROM [gpa] [a]

    INNER JOIN [gpa]

    ON [a].[term] > .[term]

    AND [a].[course] = .[course]

    WHERE .[mark] > [a].[mark]

    AND [a].[term] = @term;

    --SELECT *

    --FROM [gpa]

    --WHERE [term] <= @term;

    SELECT @term as [term],

    SUM([mark]) / COUNT([mark]) as [cumulativeGPA]

    FROM [gpa]

    WHERE [term] <= @term

    AND [mark] <> 0;

    END;

    CLOSE c_gpa;

    DROP TABLE [gpa];

    Bob
    -----------------------------------------------------------------------------
    How to post to get the best help[/url]

  • Robert klimes (7/29/2015)


    I am trying to calculate cumulative GPA which is currently it is being done via a cursor and i was looking to change it to set based if possible.

    The cumulative GPA is basically the average marks of all courses in a term and all previous terms. Where I am having an issue is if the course was taken in a previous term and the mark was lower, then the lower mark in not factored in to the GPA. If a course was taken in a previous term and the mark was higher then the marks for the current term are not factored into the GPA.

    In the example below, math is taken in all three terms and the current term's mark is included in the GPA but not the previous terms because current terms mark is higher. Two classes (bio,chem) are taken in third term but not included due to marks being higher in second term. Any suggestions would be much appreciated. thanks

    CREATE TABLE [gpa]([term] int,

    [course] varchar(10),

    [mark] float);

    INSERT INTO [gpa]

    VALUES (1,'math',3),

    (1,'eng',4),

    (1,'hist',4.5),

    (1,'anth',4.5),

    (2,'comp',3),

    (2,'math',4),

    (2,'bio',4.5),

    (2,'chem',4.5),

    (3,'soc',4.5),

    (3,'math',4.5),

    (3,'bio',4),

    (3,'chem',4);

    DECLARE @term int;

    DECLARE c_gpa CURSOR

    FOR

    SELECT DISTINCT [term]

    FROM [gpa]

    ORDER BY [term];

    OPEN c_gpa;

    WHILE 1 = 1

    BEGIN

    FETCH NEXT FROM c_gpa INTO @term;

    IF @@FETCH_STATUS < 0

    BEGIN

    BREAK

    END;

    UPDATE [a]

    SET [mark] = 0

    FROM [gpa] [a]

    INNER JOIN [gpa]

    ON [a].[term] < .[term]

    AND [a].[course] = .[course]

    AND .[term] = @term

    WHERE .[mark] > [a].[mark]

    AND [a].[term] < @term;

    UPDATE [a]

    SET [mark] = 0

    FROM [gpa] [a]

    INNER JOIN [gpa]

    ON [a].[term] > .[term]

    AND [a].[course] = .[course]

    WHERE .[mark] > [a].[mark]

    AND [a].[term] = @term;

    --SELECT *

    --FROM [gpa]

    --WHERE [term] <= @term;

    SELECT @term as [term],

    SUM([mark]) / COUNT([mark]) as [cumulativeGPA]

    FROM [gpa]

    WHERE [term] <= @term

    AND [mark] <> 0;

    END;

    CLOSE c_gpa;

    DROP TABLE [gpa];

    Have you tried LAG? This is just a minor twist on a typical running total which is easily solved with LAG. That of course is assuming you are on 2012.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Based on the stated requirements... I think this is what you're looking for...

    IF OBJECT_ID('tempdb..#temp') IS NOT NULL

    DROP TABLE #temp;

    CREATE TABLE #temp([term] int,

    [course] varchar(10),

    [mark] float);

    INSERT INTO #temp (term,course,mark) VALUES

    (1,'math',3),

    (1,'eng',4),

    (1,'hist',4.5),

    (1,'anth',4.5),

    (2,'comp',3),

    (2,'math',4),

    (2,'bio',4.5),

    (2,'chem',4.5),

    (3,'soc',4.5),

    (3,'math',4.5),

    (3,'bio',4),

    (3,'chem',4);

    WITH NewMark AS (

    SELECT

    t.term,

    CASE WHEN t.mark = MAX(t.mark) OVER (PARTITION BY t.course ORDER BY t.course, t.term) THEN t.mark END AS NewMark

    FROM

    #temp t

    ), AvgPrep AS (

    SELECT

    nm.term,

    COUNT(nm.NewMark) * 1.0 AS CourseCount,

    SUM(nm.NewMark) * 1.0 MarkSum

    FROM

    NewMark nm

    GROUP BY

    nm.term

    )

    SELECT

    ap.term,

    SUM(ap.MarkSum) OVER (ORDER BY ap.term) / SUM(ap.CourseCount) OVER (ORDER BY ap.term) AS RunningAvg

    FROM

    AvgPrep ap

    Here are the results...

    term RunningAvg

    ----------- ----------------------

    1 4

    2 4

    3 4.1

  • I am on the last day of a cruise vacation and pretty much brain-dead, but I think this does the trick, although I am not certain you want to use the MAX score for each class in all term averages. I am pretty sure there is a simpler solution too

    This also adds in a StudenID, so future solutions can be sure to take that into account, which I am certain the full solution will need to do. It also uses tempdb so we don't create permanent objects in other peoples databases. 🙂

    USE tempdb

    GO

    CREATE TABLE [gpa](

    StudentID INT,

    [term] int,

    [course] varchar(10),

    [mark] float);

    INSERT INTO [gpa]

    VALUES (1,1,'math',3),

    (1,1,'eng',4),

    (1,1,'hist',4.5),

    (1,1,'anth',4.5),

    (1,2,'comp',3),

    (1,2,'math',4),

    (1,2,'bio',4.5),

    (1,2,'chem',4.5),

    (1,3,'soc',4.5),

    (1,3,'math',4.5),

    (1,3,'bio',4),

    (1,3,'chem',4),

    (2,1,'math',3),

    (2,1,'eng',4),

    (2,1,'hist',4.5),

    (2,1,'anth',4.5),

    (2,2,'comp',3),

    (2,2,'math',4),

    (2,2,'bio',4.5),

    (2,2,'chem',4.5),

    (2,3,'soc',4.5),

    (2,3,'math',4.5),

    (2,3,'bio',4),

    (2,3,'chem',4);

    ;WITH a AS (

    SELECT StudentID, term, course,

    MAX(mark) OVER (PARTITION BY StudentID, course ORDER BY StudentID, course

    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS topScore

    FROM dbo.gpa),

    b AS (

    SELECT StudentID, term,

    COUNT(*) OVER (PARTITION BY StudentID ORDER BY StudentID, term

    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS runningCount,

    SUM(topScore) OVER (PARTITION BY StudentID ORDER BY StudentID, term

    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS runningSum

    FROM a),

    c AS (

    SELECT StudentID, term, runningCount, runningSum,

    ROW_NUMBER() OVER (PARTITION BY StudentID, term ORDER BY runningCount DESC) AS rownum

    FROM b)

    SELECT StudentID, term, runningSum/(1.0*c.runningCount)

    FROM c

    WHERE rownum = 1

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Have you tried LAG? This is just a minor twist on a typical running total which is easily solved with LAG. That of course is assuming you are on 2012.

    My typical running total would be something like query below and not sure how to incorporate Lag with this, I'll play around and see what i can figure out.

    select term, course, mark, avg(mark) over (order by term rows between unbounded preceding and current row) as cumulativeGPA

    from gpa

    order by term

    Bob
    -----------------------------------------------------------------------------
    How to post to get the best help[/url]

  • Jason A. Long (7/29/2015)


    Based on the stated requirements... I think this is what you're looking for...

    IF OBJECT_ID('tempdb..#temp') IS NOT NULL

    DROP TABLE #temp;

    CREATE TABLE #temp([term] int,

    [course] varchar(10),

    [mark] float);

    INSERT INTO #temp (term,course,mark) VALUES

    (1,'math',3),

    (1,'eng',4),

    (1,'hist',4.5),

    (1,'anth',4.5),

    (2,'comp',3),

    (2,'math',4),

    (2,'bio',4.5),

    (2,'chem',4.5),

    (3,'soc',4.5),

    (3,'math',4.5),

    (3,'bio',4),

    (3,'chem',4);

    WITH NewMark AS (

    SELECT

    t.term,

    CASE WHEN t.mark = MAX(t.mark) OVER (PARTITION BY t.course ORDER BY t.course, t.term) THEN t.mark END AS NewMark

    FROM

    #temp t

    ), AvgPrep AS (

    SELECT

    nm.term,

    COUNT(nm.NewMark) * 1.0 AS CourseCount,

    SUM(nm.NewMark) * 1.0 MarkSum

    FROM

    NewMark nm

    GROUP BY

    nm.term

    )

    SELECT

    ap.term,

    SUM(ap.MarkSum) OVER (ORDER BY ap.term) / SUM(ap.CourseCount) OVER (ORDER BY ap.term) AS RunningAvg

    FROM

    AvgPrep ap

    Here are the results...

    term RunningAvg

    ----------- ----------------------

    1 4

    2 4

    3 4.1

    The results aren't quite right(terms 2 and 3 are low) but I will have a look to see if I can tweak this.

    Bob
    -----------------------------------------------------------------------------
    How to post to get the best help[/url]

  • TheSQLGuru (7/29/2015)


    I am on the last day of a cruise vacation and pretty much brain-dead, but I think this does the trick, although I am not certain you want to use the MAX score for each class in all term averages. I am pretty sure there is a simpler solution too

    This also adds in a StudenID, so future solutions can be sure to take that into account, which I am certain the full solution will need to do. It also uses tempdb so we don't create permanent objects in other peoples databases. 🙂

    USE tempdb

    GO

    CREATE TABLE [gpa](

    StudentID INT,

    [term] int,

    [course] varchar(10),

    [mark] float);

    INSERT INTO [gpa]

    VALUES (1,1,'math',3),

    (1,1,'eng',4),

    (1,1,'hist',4.5),

    (1,1,'anth',4.5),

    (1,2,'comp',3),

    (1,2,'math',4),

    (1,2,'bio',4.5),

    (1,2,'chem',4.5),

    (1,3,'soc',4.5),

    (1,3,'math',4.5),

    (1,3,'bio',4),

    (1,3,'chem',4),

    (2,1,'math',3),

    (2,1,'eng',4),

    (2,1,'hist',4.5),

    (2,1,'anth',4.5),

    (2,2,'comp',3),

    (2,2,'math',4),

    (2,2,'bio',4.5),

    (2,2,'chem',4.5),

    (2,3,'soc',4.5),

    (2,3,'math',4.5),

    (2,3,'bio',4),

    (2,3,'chem',4);

    ;WITH a AS (

    SELECT StudentID, term, course,

    MAX(mark) OVER (PARTITION BY StudentID, course ORDER BY StudentID, course

    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS topScore

    FROM dbo.gpa),

    b AS (

    SELECT StudentID, term,

    COUNT(*) OVER (PARTITION BY StudentID ORDER BY StudentID, term

    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS runningCount,

    SUM(topScore) OVER (PARTITION BY StudentID ORDER BY StudentID, term

    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS runningSum

    FROM a),

    c AS (

    SELECT StudentID, term, runningCount, runningSum,

    ROW_NUMBER() OVER (PARTITION BY StudentID, term ORDER BY runningCount DESC) AS rownum

    FROM b)

    SELECT StudentID, term, runningSum/(1.0*c.runningCount)

    FROM c

    WHERE rownum = 1

    If you're on Vacation you shouldn't be answering forums posts:-P. You're absolutely correct that I would need studentID, I was just trying to make example as simple as possible. This solution unfortunately doesn't give desired results(high on all terms) but i will have a look to see if it can be tweaked.

    Bob
    -----------------------------------------------------------------------------
    How to post to get the best help[/url]

  • This should give the right results, but it strikes me as rather inelegant. Sometimes all I can find is a hammer 🙂

    CREATE TABLE [gpa]([term] int,

    [course] varchar(10),

    [mark] float);

    INSERT INTO [gpa]

    VALUES (1,'math',3),

    (1,'eng',4),

    (1,'hist',4.5),

    (1,'anth',4.5),

    (2,'comp',3),

    (2,'math',4),

    (2,'bio',4.5),

    (2,'chem',4.5),

    (3,'soc',4.5),

    (3,'math',4.5),

    (3,'bio',4),

    (3,'chem',4);

    WITH CTE AS (

    SELECT course, MAX(mark) AS HighestScoreSoFar, CuTerm

    FROM gpa INNER JOIN (SELECT DISTINCT term from gpa) x (CuTerm)

    ON gpa.term<=x.Cuterm

    GROUP BY CuTerm, course)

    SELECT AVG(HighestScoreSoFar), CuTerm FROM CTE

    GROUP BY CuTerm

    I feel like there should be a better way than this triangular join silliness, and hopefully someone better at this than I am will find it! Then again, there may not be, given the nature of the problem. I suppose we'll find out 🙂

    Cheers!

  • Jacob Wilkins (7/29/2015)


    This should give the right results, but it strikes me as rather inelegant. Sometimes all I can find is a hammer 🙂

    CREATE TABLE [gpa]([term] int,

    [course] varchar(10),

    [mark] float);

    INSERT INTO [gpa]

    VALUES (1,'math',3),

    (1,'eng',4),

    (1,'hist',4.5),

    (1,'anth',4.5),

    (2,'comp',3),

    (2,'math',4),

    (2,'bio',4.5),

    (2,'chem',4.5),

    (3,'soc',4.5),

    (3,'math',4.5),

    (3,'bio',4),

    (3,'chem',4);

    WITH CTE AS (

    SELECT course, MAX(mark) AS HighestScoreSoFar, CuTerm

    FROM gpa INNER JOIN (SELECT DISTINCT term from gpa) x (CuTerm)

    ON gpa.term<=x.Cuterm

    GROUP BY CuTerm, course)

    SELECT AVG(HighestScoreSoFar), CuTerm FROM CTE

    GROUP BY CuTerm

    I feel like there should be a better way than this triangular join silliness, and hopefully someone better at this than I am will find it! Then again, there may not be, given the nature of the problem. I suppose we'll find out 🙂

    Cheers!

    Thanks Jacob, this does give the correct results. now to see if I can incorporate this logic into my actual query.

    Bob
    -----------------------------------------------------------------------------
    How to post to get the best help[/url]

  • Robert klimes (7/29/2015)


    The results aren't quite right(terms 2 and 3 are low) but I will have a look to see if I can tweak this.

    I read "Cumulative Average" to mean a running average...

    So...

    term 1 is 3, 4, 4.5, 4.5 = avg of 4

    term 2 is 3, 4, 4.5, 4.5 & 3, 4, 4.5, 4.5 = avg of 4

    term 3 is 3, 4, 4.5, 4.5 & 3, 4, 4.5, 4.5 & 4.5, 4.5 = avg of 4.1

    If that's not the case and each term gets it's own discrete average, you can use the following...

    IF OBJECT_ID('tempdb..#temp') IS NOT NULL

    DROP TABLE #temp;

    CREATE TABLE #temp([term] int,

    [course] varchar(10),

    [mark] float);

    INSERT INTO #temp (term,course,mark) VALUES

    (1,'math',3),

    (1,'eng',4),

    (1,'hist',4.5),

    (1,'anth',4.5),

    (2,'comp',3),

    (2,'math',4),

    (2,'bio',4.5),

    (2,'chem',4.5),

    (3,'soc',4.5),

    (3,'math',4.5),

    (3,'bio',4),

    (3,'chem',4);

    WITH NewMark AS (

    SELECT

    t.term,

    CASE WHEN t.mark = MAX(t.mark) OVER (PARTITION BY t.course ORDER BY t.course, t.term) THEN t.mark END AS NewMark

    FROM

    #temp t

    )

    SELECT

    nm.term,

    AVG(nm.NewMark) AS AvgMark

    FROM

    NewMark nm

    GROUP BY

    nm.term

    Results...

    term AvgMark

    ----------- ----------------------

    1 4

    2 4

    3 4.5

  • Robert klimes (7/29/2015)


    I am trying to calculate cumulative GPA which is currently it is being done via a cursor and i was looking to change it to set based if possible.

    The cumulative GPA is basically the average marks of all courses in a term and all previous terms. Where I am having an issue is if the course was taken in a previous term and the mark was lower, then the lower mark in not factored in to the GPA. If a course was taken in a previous term and the mark was higher then the marks for the current term are not factored into the GPA.

    In the example below, math is taken in all three terms and the current term's mark is included in the GPA but not the previous terms because current terms mark is higher. Two classes (bio,chem) are taken in third term but not included due to marks being higher in second term. Any suggestions would be much appreciated. thanks

    CREATE TABLE [gpa]([term] int,

    [course] varchar(10),

    [mark] float);

    INSERT INTO [gpa]

    VALUES (1,'math',3),

    (1,'eng',4),

    (1,'hist',4.5),

    (1,'anth',4.5),

    (2,'comp',3),

    (2,'math',4),

    (2,'bio',4.5),

    (2,'chem',4.5),

    (3,'soc',4.5),

    (3,'math',4.5),

    (3,'bio',4),

    (3,'chem',4);

    DECLARE @term int;

    DECLARE c_gpa CURSOR

    FOR

    SELECT DISTINCT [term]

    FROM [gpa]

    ORDER BY [term];

    OPEN c_gpa;

    WHILE 1 = 1

    BEGIN

    FETCH NEXT FROM c_gpa INTO @term;

    IF @@FETCH_STATUS < 0

    BEGIN

    BREAK

    END;

    UPDATE [a]

    SET [mark] = 0

    FROM [gpa] [a]

    INNER JOIN [gpa]

    ON [a].[term] < .[term]

    AND [a].[course] = .[course]

    AND .[term] = @term

    WHERE .[mark] > [a].[mark]

    AND [a].[term] < @term;

    UPDATE [a]

    SET [mark] = 0

    FROM [gpa] [a]

    INNER JOIN [gpa]

    ON [a].[term] > .[term]

    AND [a].[course] = .[course]

    WHERE .[mark] > [a].[mark]

    AND [a].[term] = @term;

    --SELECT *

    --FROM [gpa]

    --WHERE [term] <= @term;

    SELECT @term as [term],

    SUM([mark]) / COUNT([mark]) as [cumulativeGPA]

    FROM [gpa]

    WHERE [term] <= @term

    AND [mark] <> 0;

    END;

    CLOSE c_gpa;

    DROP TABLE [gpa];

    First, is the right answer 4.3333 ? I changed the data type in your table to decimal(5, 4), as the float data type would be unable to accurately represent any mark value that ended in point 1 (e.g. 4.1 or 3.1, etc.). If I read your post correctly, you appear to just need the MAX value of a mark in a given course for a given student and course across ALL terms, and then just average all those maximum marks, as there will be exactly 1 per course, regardless of the number of terms. If I have that right, then here's the simple quiery:

    CREATE TABLE #gpa (

    [student] int,

    [term] int,

    [course] varchar(10),

    [mark] decimal(5,4)

    );

    INSERT INTO #gpa (student, term, course, mark)

    VALUES (1,1,'math',3),

    (1,1,'eng',4),

    (1,1,'hist',4.5),

    (1,1,'anth',4.5),

    (1,2,'comp',3),

    (1,2,'math',4),

    (1,2,'bio',4.5),

    (1,2,'chem',4.5),

    (1,3,'soc',4.5),

    (1,3,'math',4.5),

    (1,3,'bio',4),

    (1,3,'chem',4);

    WITH MAX_MARKS AS (

    SELECT G.student, G.course, MAX(G.mark) AS MAX_MARK

    FROM #gpa AS G

    GROUP BY G.student, G.course

    )

    SELECT G.student, MIN(G.term) AS STARTING_TERM, MAX(G.term) AS ENDING_TERM, AVG(M.MAX_MARK) AS AVERAGE_GPA,

    STUFF(

    (SELECT DISTINCT ', ' + CT.course

    FROM #gpa AS CT

    WHERE CT.student = G.student

    ORDER BY 1

    FOR XML PATH(''), TYPE).value('.', N'varchar(max)')

    , 1, 2, '') AS CourseList

    FROM #gpa AS G

    INNER JOIN MAX_MARKS AS M

    ON G.student = M.student

    AND G.course = M.course

    GROUP BY G.student

    DROP TABLE #gpa

    Here are the results:

    student STARTING_TERM ENDING_TERM AVERAGE_GPA CourseList

    ----------- ------------- ----------- --------------------------------------- --------------------------------------------------

    1 1 3 4.333333 anth, bio, chem, comp, eng, hist, math, soc

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • It is a cumulative average, but it has a wrinkle that makes it more expensive to calculate.

    For term 1, you average all the marks.

    For subsequent terms, you average all the marks received up to and including that term, with the caveat that if a course has been taken multiple times in the terms in question, only the occurrence with the highest mark counts.

    So, the math course taken in term 1 counts towards term 1's average, but does not count towards term 2's average, because now there's an occurrence of that course with a higher mark.

    The problem with just checking whether the current term's mark is the highest so far is that term 1's math was the highest so far, but so was term 2's math, so they would both get counted.

    The second wrinkle is that in addition to accounting for that, you also can't just group by the term, because not all courses occur each term.

    For example, for term 2, you need the average of all the term 2 marks and all the term 1 marks except for term 1's math, because that math mark has been outscored in term 2.

    Similarly, the bio and chem marks in term 3 do not count towards the cumulative average, because the scores in term 2 were higher for those courses.

    That was how I understood the task, at least.

    Cheers!

    EDIT: I figured I would modify my query so I could show which marks were getting averaged for each term's cumulative average. Here are those results. Hopefully this helps make it clearer! I further edited it so that the results were also sorted by TermMarkAchieved, since that seemed easier to read.

    course HighestScoreSoFar CuTerm TermMarkAchieved

    ---------- ---------------------- ----------- ----------------

    math 3 1 1

    eng 4 1 1

    hist 4.5 1 1

    anth 4.5 1 1

    anth 4.5 2 1

    hist 4.5 2 1

    eng 4 2 1

    comp 3 2 2

    math 4 2 2

    bio 4.5 2 2

    chem 4.5 2 2

    eng 4 3 1

    hist 4.5 3 1

    anth 4.5 3 1

    chem 4.5 3 2

    bio 4.5 3 2

    comp 3 3 2

    soc 4.5 3 3

    math 4.5 3 3

  • First, is the right answer 4.3333 ? I changed the data type in your table to decimal(5, 4), as the float data type would be unable to accurately represent any mark value that ended in point 1 (e.g. 4.1 or 3.1, etc.). If I read your post correctly, you appear to just need the MAX value of a mark in a given course for a given student and course across ALL terms, and then just average all those maximum marks, as there will be exactly 1 per course, regardless of the number of terms. If I have that right, then here's the simple quiery:

    CREATE TABLE #gpa (

    [student] int,

    [term] int,

    [course] varchar(10),

    [mark] decimal(5,4)

    );

    INSERT INTO #gpa (student, term, course, mark)

    VALUES (1,1,'math',3),

    (1,1,'eng',4),

    (1,1,'hist',4.5),

    (1,1,'anth',4.5),

    (1,2,'comp',3),

    (1,2,'math',4),

    (1,2,'bio',4.5),

    (1,2,'chem',4.5),

    (1,3,'soc',4.5),

    (1,3,'math',4.5),

    (1,3,'bio',4),

    (1,3,'chem',4);

    WITH MAX_MARKS AS (

    SELECT G.student, G.course, MAX(G.mark) AS MAX_MARK

    FROM #gpa AS G

    GROUP BY G.student, G.course

    )

    SELECT G.student, MIN(G.term) AS STARTING_TERM, MAX(G.term) AS ENDING_TERM, AVG(M.MAX_MARK) AS AVERAGE_GPA,

    STUFF(

    (SELECT DISTINCT ', ' + CT.course

    FROM #gpa AS CT

    WHERE CT.student = G.student

    ORDER BY 1

    FOR XML PATH(''), TYPE).value('.', N'varchar(max)')

    , 1, 2, '') AS CourseList

    FROM #gpa AS G

    INNER JOIN MAX_MARKS AS M

    ON G.student = M.student

    AND G.course = M.course

    GROUP BY G.student

    DROP TABLE #gpa

    Here are the results:

    student STARTING_TERM ENDING_TERM AVERAGE_GPA CourseList

    ----------- ------------- ----------- --------------------------------------- --------------------------------------------------

    1 1 3 4.333333 anth, bio, chem, comp, eng, hist, math, soc

    the example I provided gave the desired results. I dont just need a final GPA but I need a cumulative GPA per term including the max mark for each course upto that term.

    In the example data GPA for term 1 is 4 (3:math,4:eng,4.5:hist,4.5:anth), GPA for term 2 is 4.14285.... (4;eng,4.5:hist,4.5:anth,3:comp,4:math,4.5:bio,4.5:chem), only the math with the highest grade is included. GPA for term 3 is 4.25 (4;eng,4.5:hist,4.5:anth,3:comp,4.5:math,4.5:bio,4.5:chem, 4.5:soc), only the math from term 3 and bio+chem from term two are used.

    term1 3+4+4.5+4.5 = 16/4(courses) = 4 GPA

    term2 4+4.5+4.5+3+4+4.5+4.5 = 29/7(courses) = 4.14285.... GPA

    term3 4+4.5+4.5+3+4.5+4.5+4.5+4.5 = 34/8(courses) = 4.25 GPA

    Bob
    -----------------------------------------------------------------------------
    How to post to get the best help[/url]

  • Yea... I see what you did. Sounds like you're the one with the correct interpretation of the requirements. Nice solution BTW. With a couple of indexes it would be easy to get rid of the 2 sort operators in the execution plan, making it not very expensive at all. Well done sir!

  • Thanks! Me, broken clocks, blind squirrels, etc. 🙂

Viewing 15 posts - 1 through 15 (of 23 total)

You must be logged in to reply to this topic. Login to reply