SQL Help in grouping

  • Hello,

    SQL Server Version 2000

    create table students

    (

    course_id varchar(5),

    course_name varchar(10),

    roll_no varchar(3),

    student_name varchar(40),

    geog_marks varchar(2),

    math_marks varchar(2)

    )

    insert into students values ('1','course1','1','abc','40','')

    insert into students values ('1','course1','2','aaa','40','')

    insert into students values ('1','course1','3','bbb','40','')

    insert into students values ('2','course2','1','abc','','50')

    insert into students values ('2','course2','2','aaa','','50')

    insert into students values ('2','course2','3','bbb','45','')

    insert into students values ('3','course3','1','abc','','50')

    insert into students values ('4','course4','1','abc','45','50')

    insert into students values ('4','course4','2','aaa','45','50')

    insert into students values ('4','course4','3','bbb','45','50')

    insert into students values ('5','course5','1','abc','55','35')

    insert into students values ('5','course5','2','aaa','45','35')

    Above is sample of rows in my table (actually it is not a table, the output of above select * from students is the output of my one query). I need below output please :

    course_idcourse_namerollnodescriptiongeogmath

    1course11Default40

    2course21Default50

    2course23bbb45

    3course31abc50

    4course41Default4550

    5course51abc5535

    5course52aaa4535

    Logic behind required output :

    1.In course_id 1, all student got 40 marks in geog and no marks in maths, so output should be like above.

    2.In course_id 2, all student except rollno 3 got 50 marks in maths and rollno 3 got 45 (not equal to 50), so for all 2 student there should be "Default" word in description column and for rollno 3, his name and marks in geog subject.

    3.In course_id 3, there is only one student, so no grouping, so simple show of all the info as above.

    4.In course_id 4, all student got same marks in both the subjects, so it will be treated as course_id 1 above.

    5.In course_id 5, two student got different marks in both the subjects, so there are no rows greater than 1 for whom we can use "Default" word, so output should be like above.

    "Default" word will be used only for those rows in which there are more than one row which have same marks in both or any one subject in a course.

    In the continuation of above output, I wish to know in which courses there are only "Default" marks got by students and "Non-Default" marks got by student something like this seperately :

    List of Courses in which student got Default marks:

    course_idcourse_namerollnodescriptiongeogmath

    1course11Default40

    4course41Default4550

    3course31abc50 <- Since only one student, so it can be treated as default

    And,

    List of Courses in which student got Non-Default marks:

    course_idcourse_namerollnodescriptiongeogmath

    2course21Default50

    2course23bbb45

    5course51abc5535

    5course52aaa4535

    I am using SQL Server 2000, this is big problem with me, if I were using version 2005 and or later one, then it could be solved in easily with over...(partition by) clause, but since we can not move to the higher version, so kindly help me how do I write SQL to get above output.

    Kindly let me know, if I am unclear in my question and/or provide more clarification for the required output. Generally, when we see a question related to rollno, course, school name etc. it is treated as a class work, but above is real time data in our corporate school database, so please help me.

    For the solution, I have got plenty of examples but all are using version 2005 or greater than, but I am using version 2000, so it forced me write question here.

    Thanks and Regards

    Girish Sharma

  • I am able to get 1st and 2nd output from below query :

    For 1st output :

    select

    a.course_id

    ,a.[Course]

    ,a.[Rollno]

    ,a.Description

    ,a.[Geog Marks]

    ,a.[Math Marks]

    from

    (

    select

    course_id

    ,min(course_name) as 'Course'

    ,min(roll_no) as 'Rollno'

    ,case when count(course_id)>1 then min('Default') else min(student_name) end as Description

    ,geog_marks as 'Geog Marks'

    ,math_marks as 'Math Marks'

    from students

    group by course_id,geog_marks,math_marks

    ) a

    group by

    a.course_id

    ,a.[Course]

    ,a.[Rollno]

    ,a.Description

    ,a.[Geog Marks]

    ,a.[Math Marks]

    and for 2nd output :

    select

    a.course_id

    ,min(a.course) as 'Course'

    ,min(a.rollno) as 'Rollno'

    ,min(a.description) as 'Description'

    ,min(a.[Geog Marks]) as 'Geog Marks'

    ,min(a.[Math Marks]) as 'Math Marks'

    from

    (

    select

    course_id

    ,min(course_name) as 'Course'

    ,min(roll_no) as 'Rollno'

    ,case when count(course_id)>1 then min('Default') else min(student_name) end as Description

    ,min(student_name) as 'Student Name'

    ,min(geog_marks) as 'Geog Marks'

    ,min(math_marks) as 'Math Marks'

    from students

    group by course_id,geog_marks,math_marks

    ) a

    group by a.course_id

    having count(a.course_id)=1

    For 3rd,

    but I am not able to fetch only course 2 and 5's rows from 1st query. Kindly help me how do I get 3rd output.

    Regards

    Girish Sharma

  • Yes, now I am able to get 3rd output by using Whole query output Except 2nd query output :

    select

    a.course_id

    ,a.[Course]

    ,a.[Rollno]

    ,a.Description

    ,a.[Geog Marks]

    ,a.[Math Marks]

    from

    (

    select

    course_id

    ,min(course_name) as 'Course'

    ,min(roll_no) as 'Rollno'

    ,case when count(course_id)>1 then min('Default') else min(student_name) end as Description

    ,geog_marks as 'Geog Marks'

    ,math_marks as 'Math Marks'

    from students

    group by course_id,geog_marks,math_marks

    ) a

    group by

    a.course_id

    ,a.[Course]

    ,a.[Rollno]

    ,a.Description

    ,a.[Geog Marks]

    ,a.[Math Marks]

    except

    select

    a.course_id

    ,min(a.course) as 'Course'

    ,min(a.rollno) as 'Rollno'

    ,min(a.description) as 'Description'

    ,min(a.[Geog Marks]) as 'Geog Marks'

    ,min(a.[Math Marks]) as 'Math Marks'

    from

    (

    select

    course_id

    ,min(course_name) as 'Course'

    ,min(roll_no) as 'Rollno'

    ,case when count(course_id)>1 then min('Default') else min(student_name) end as Description

    ,min(student_name) as 'Student Name'

    ,min(geog_marks) as 'Geog Marks'

    ,min(math_marks) as 'Math Marks'

    from students

    group by course_id,geog_marks,math_marks

    ) a

    group by a.course_id

    having count(a.course_id)=1

    Query and output is fine, but I am sure it can be done more efficiently.

    Regards

    Girish Sharma

  • SELECT

    s.course_id, s.course_name,

    roll_no = MIN(s.roll_no),

    student_name = CASE WHEN g.SetSize = 1 THEN MIN(s.student_name) ELSE 'Default' END,

    s.geog_marks, s.math_marks

    FROM #students s

    INNER JOIN (

    SELECT course_id, geog_marks, math_marks, SetSize = COUNT(*)

    FROM #students

    GROUP BY course_id, geog_marks, math_marks

    ) g

    ON g.course_id = s.course_id

    AND g.geog_marks = s.geog_marks

    AND g.math_marks = s.math_marks

    GROUP BY s.course_id, s.course_name, s.geog_marks, s.math_marks, g.SetSize

    ORDER BY s.course_id, roll_no

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Hi Chris,

    Thanks for your reply. But I am getting :

    Invalid object name '#students'.

    and what is g.SetSize please?

    Regards

    Girish Sharma

  • #students is a local temporary table I used instead of the permanent sample table students. Just remove the #.

    g.SetSize is the calculated row count of the set in the partition (the GROUP BY group).

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Thanks Chris. This is giving me output of 1st query. Can you please help me how do I get rest 2nd and 3rd query outputs efficiently. Even though I have posted mine way, but I am sure they are not efficient and will work slow if there are thousand of rows in the table.

    Regards

    Girish Sharma

  • The results of my query match your desired output table. What do you want me to do?

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Yes, result of your query match with 1st query output. In my original question, I have posted two more required output based upon output of 1st query. I am just following to get it efficiently (for next two queries) please.

    Regards

    Girish Sharma

  • -- like this

    SELECT

    s.course_id, s.course_name,

    roll_no = MIN(s.roll_no),

    student_name = CASE WHEN g.SetSize = 1 THEN MIN(s.student_name) ELSE 'Default' END,

    s.geog_marks, s.math_marks

    FROM #students s

    INNER JOIN (

    SELECT course_id, geog_marks, math_marks, SetSize = COUNT(*)

    FROM #students

    GROUP BY course_id, geog_marks, math_marks

    ) g

    ON g.course_id = s.course_id

    AND g.geog_marks = s.geog_marks

    AND g.math_marks = s.math_marks

    GROUP BY s.course_id, s.course_name, s.geog_marks, s.math_marks, g.SetSize

    HAVING (CASE WHEN g.SetSize = 1 THEN MIN(s.student_name) ELSE 'Default' END) = 'Default'

    ORDER BY s.course_id, roll_no

    -- or run the result set into a temporary table:

    IF object_id('TempDB..#Results') IS NOT NULL

    DROP TABLE #Results

    SELECT

    s.course_id, s.course_name,

    roll_no = MIN(s.roll_no),

    student_name = CASE WHEN g.SetSize = 1 THEN MIN(s.student_name) ELSE 'Default' END,

    s.geog_marks, s.math_marks

    INTO #Results

    FROM #students s

    INNER JOIN (

    SELECT course_id, geog_marks, math_marks, SetSize = COUNT(*)

    FROM #students

    GROUP BY course_id, geog_marks, math_marks

    ) g

    ON g.course_id = s.course_id

    AND g.geog_marks = s.geog_marks

    AND g.math_marks = s.math_marks

    GROUP BY s.course_id, s.course_name, s.geog_marks, s.math_marks, g.SetSize

    ORDER BY s.course_id, roll_no

    SELECT *

    FROM #Results

    SELECT *

    FROM #Results

    WHERE student_name = 'Default'

    SELECT *

    FROM #Results

    WHERE student_name <> 'Default'

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Thank you Chris for your continue support. Probably I am not able to find a bug in the query but when I am saying :

    SELECT *

    FROM #Results

    WHERE student_name <> 'Default'

    which is not returning me the required output No. 3, same when I am saying :

    SELECT *

    FROM #Results

    WHERE student_name = 'Default'

    is not returning me the required output No. 2, while

    SELECT *

    FROM #Results

    is giving me the correct oupput of required output No. 1.

    I am really thankful to you for your valuable code which is simultaneously giving me the learning of SQL too (in which I am very much poor)

    Regards

    Girish Sharma

  • gksharmaajmer (10/2/2013)


    Thank you Chris for your continue support. Probably I am not able to find a bug in the query but when I am saying :

    SELECT *

    FROM #Results

    WHERE student_name <> 'Default'

    which is not returning me the required output No. 3, same when I am saying :

    SELECT *

    FROM #Results

    WHERE student_name = 'Default'

    is not returning me the required output No. 2, while

    SELECT *

    FROM #Results

    is giving me the correct oupput of required output No. 1.

    I am really thankful to you for your valuable code which is simultaneously giving me the learning of SQL too (in which I am very much poor)

    Regards

    Girish Sharma

    You're welcome.

    Apart from the first row of your third table (which I don't think should be there), the queries appear to generate the correct results from your sample data. If you are getting different results then describe what you see - check using the sample data set you've provided which may differ from your actual data. If this is the case, then set up a new sample data set which demonstrates.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • gksharmaajmer (10/2/2013)


    Thank you Chris for your continue support. Probably I am not able to find a bug in the query but when I am saying :

    SELECT *

    FROM #Results

    WHERE student_name <> 'Default'

    which is not returning me the required output No. 3, same when I am saying :

    SELECT *

    FROM #Results

    WHERE student_name = 'Default'

    is not returning me the required output No. 2, while

    SELECT *

    FROM #Results

    is giving me the correct oupput of required output No. 1.

    I am really thankful to you for your valuable code which is simultaneously giving me the learning of SQL too (in which I am very much poor)

    Regards

    Girish Sharma

    You need to translate your requests to data language:

    "List of Courses in which student got Default marks:" - in fact means "List of courses with a single entry in the final dataset" (table #Result);

    and

    "List of Courses in which student got Non-Default marks:" - means "List of courses with more than 1 entry in the final dataset".

    Should be easy task from here.

    _____________
    Code for TallyGenerator

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply