Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 12»»

Query Help to Match Multiple Results and Average Expand / Collapse
Author
Message
Posted Monday, March 3, 2014 8:26 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Monday, March 10, 2014 2:27 PM
Points: 6, Visits: 10
I work for a school district and need some help with a new requirement we were just given for scoring some student scores.

Everything will currently be in one table to keep track of students test scores for various things. This table will contain students information and a student will be shown more than once in the table. The Student ID is what we will key off of to find the multiple instances of the student. The table contains the following columns: studentName, StudentId, teacherName, focus1, controllingIdeas1, reading1, development1, organization1, conventions1, and contentUnderstanding1. All of the columns with a 1 at the end will be numeric values with possible decimal values.

What we need to be able to do is some how perform a search for these multiple entries of each student and when found, average the 2 scores for each 7 test categories. The result needs to be a single line for each student that gives the student name, student id, and the 7 test category averages exported to an csv file.

Any help of where to start and how to accomplish this task is greatly appreciated.
Post #1547146
Posted Tuesday, March 4, 2014 12:05 AM
SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Thursday, July 17, 2014 7:01 AM
Points: 470, Visits: 475
Could you please post DDL for table, some sample set and the expected O/P for the same ? This will help us understand and to work on the problem better.
Post #1547168
Posted Friday, March 7, 2014 3:48 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Monday, March 10, 2014 2:27 PM
Points: 6, Visits: 10
I have attached the DDL for the database.

I do not have an example of the output data, but I'll see if I can explain it a little easier now that the DDL has been created.
We need to do a query for all duplicate studentID values. Once found, we need to then average all focus1, controlling1, reading1, etc. values for those duplicate studentIDs. Once the averages are done we then need to spit out the studentName, StudentID, focus1Average, controlling1Average, reading1Average, etc. to a csv file. I believe the export to a csv is easy after the rest of the query is done since I can just right click and save as a csv. It's the query that I am struggling with.


  Post Attachments 
LDCdatabase.zip (5 views, 1.73 KB)
Post #1548920
Posted Saturday, March 8, 2014 9:11 PM
Say Hey Kid

Say Hey KidSay Hey KidSay Hey KidSay Hey KidSay Hey KidSay Hey KidSay Hey KidSay Hey Kid

Group: General Forum Members
Last Login: Yesterday @ 1:04 PM
Points: 706, Visits: 4,507
Can you post some dummy data so we can see what your data looks like? Are you importing this from somewhere or entering it manually? The reason I ask is that the structure is somewhat puzzling. Why are all the columns defined as floats? (Even a StudentID? Normally, that's an INT).

Once you normalize, querying/summarizing this thing would be a walk in the park. Any reason you can't do that? What if you need to add a third set of values for each student? When I see structures like this, I worry.

Do you have control over the structure of the database? I get the feeling it could use some normalization. Then querying it for answers should get a lot easier.
Post #1549022
Posted Sunday, March 9, 2014 9:00 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Monday, March 10, 2014 2:27 PM
Points: 6, Visits: 10
Attached is some dummy data. We are importing the data using the SQL Data Import utility from excel spreadsheets. Normalizing is something we might do later down the road, but because of the way we are receiving the data and our time restraint to get the averages out, and not being SQL experts, we are not that concerned about it right now. The data types that are setup are what the import utility set them to, so to make things easier right now we decided to stick with them. There very well could be more than 2 results for the same student. We do have control over the database as we are hosting it locally.

  Post Attachments 
LDCScores.zip (5 views, 73.63 KB)
Post #1549095
Posted Monday, March 10, 2014 7:50 AM


Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Yesterday @ 7:32 PM
Points: 1,212, Visits: 6,550
In SSMS, go to Help... View Help.
This will pull up Books On Line.
Typing Average in the upper right search box will bring up relevant topics.

If time is of one of the main factors, a pivot table directly in Excel can be added in less than a minute.

Note in your sample data, 2 of the Students have no ID - which is the key to the Student.

Although I am glad you are trying to learn, something just seems odd.
Something as important as grades, and this is such a basic query question.
And a school district managing some of it in Excel?
Do they have any openings?
Post #1549231
Posted Monday, March 10, 2014 9:31 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Monday, March 10, 2014 2:27 PM
Points: 6, Visits: 10
IT wasn't notified of the project until last minute. An excel spreadsheet was already distributed out to teachers to submit their scores with. It's not the ideal situation, but it's what we have and are trying to work with it. Next year we hope to have a better solution in place. The scores are not for normal grades. I go into the long history of why we are at this point, but that doesn't help solve our issue at hand. I am aware that the data is not clean yet. We will be doing that once we have all of the data imported into SQL.

I understand how to use the AVG() function in SQL. The part that I am having problems with is finding all duplicate studentID's and then after averaging the results, spitting it out so it only lists the student information once with the averages.
Post #1549298
Posted Monday, March 10, 2014 11:12 AM


Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Yesterday @ 7:32 PM
Points: 1,212, Visits: 6,550
bsmith 63193 (3/10/2014)
IT wasn't notified of the project until last minute. An excel spreadsheet was already distributed out to teachers to submit their scores with. It's not the ideal situation, but it's what we have and are trying to work with it. Next year we hope to have a better solution in place. The scores are not for normal grades. I go into the long history of why we are at this point, but that doesn't help solve our issue at hand. I am aware that the data is not clean yet. We will be doing that once we have all of the data imported into SQL.

I understand how to use the AVG() function in SQL. The part that I am having problems with is finding all duplicate studentID's and then after averaging the results, spitting it out so it only lists the student information once with the averages.


Using Group By student ID and name, avg() on the other columns, duplicates are averaged.
If only 1 record, score/count of 1 is the score for that column.
Add Count(*) as a column and validate some of the data.

Our you can do a Count(*) on student ID and name, and use a having statement > 1 to see all the dupes.
Maybe I'm missing something, but you seem to be overthinking some of this.

Post #1549392
Posted Monday, March 10, 2014 12:00 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Monday, March 10, 2014 2:27 PM
Points: 6, Visits: 10
Attached is a spreadsheet that depicts what we have in SQL and what the output needs to be based on that information. It is a very small sample, but you can see that there are several students with multiple entries and some students that have a single entry. I hope this helps explain what we are trying to accomplish. I know I can do this in Excel as I have shown, but we would prefer to be able to accomplish this in SQL if possible.

  Post Attachments 
LDCScoresOutput.xlsx (12 views, 12.05 KB)
Post #1549423
Posted Monday, March 10, 2014 1:22 PM This worked for the OP Answer marked as solution


SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Today @ 3:59 AM
Points: 1,886, Visits: 18,540
maybe of some use


with cte_names as
(
SELECT StudentID, MAX(StudentName) AS StudentName
FROM lcdscores
GROUP BY StudentID
)


SELECT lcdscores.StudentID
, cte_names.StudentName
, CAST ( AVG ( lcdscores.focus1 ) AS decimal ( 9 , 2 )) AS focus1
, CAST ( AVG ( lcdscores.cont1 ) AS decimal ( 9 , 2 )) AS controlling1
FROM
lcdscores INNER JOIN cte_names ON lcdscores.StudentID = cte_names.StudentID
GROUP BY lcdscores.StudentID
, cte_names.StudentName;





__________________________________________________________________
you can lead a user to data....but you cannot make them think !
__________________________________________________________________
Post #1549451
« Prev Topic | Next Topic »

Add to briefcase 12»»

Permissions Expand / Collapse