## Calculating Median, Max, Min, Lower Quartile and Upper Quartile for a Boxchart?

 Author Message DataAnalyst011 SSCrazy Group: General Forum Members Points: 2002 Visits: 530 Hi All - My boss asked me to create a boxchart using SSRS for some of our data. I read articles all morning but honestly, I'm at a loss on how accomplish this. I think I'm growing in my SQL knowledge, but this one has me stumped. I created a sample table that looks like what I'll be working with below. If anyone could lend a hand it would really help me!The output should be something like below with values filled in for x:`Year Median Max Min LowerQ UpperQ2010 x x x x x2011 x x x x x2012 x x x x x``CREATE TABLE EStats ( PersonID VARCHAR(30) NOT NULL, Grade VARCHAR(25) NOT NULL, CourseDate Date NOT NULL);INSERT INTO EStats( PersonID, Grade, CourseDate)VALUES ('100', '91', '2010-03-01'), ('101', '96', '2010-03-01'), ('102', '88', '2010-03-01'), ('103', '92', '2010-03-01'), ('104', '81', '2010-03-01'), ('105', '85', '2010-03-01'), ('106', '91', '2010-03-01'), ('107', '89', '2010-03-01'), ('108', '99', '2010-03-01'), ('109', '88', '2010-03-01'), ('110', '81', '2011-03-02'), ('111', '77', '2011-03-02'), ('112', '88', '2011-03-02'), ('113', '76', '2011-03-02'), ('114', '69', '2011-03-02'), ('115', '70', '2011-03-02'), ('116', '75', '2011-03-02'), ('117', '88', '2011-03-02'), ('118', '76', '2011-03-02'), ('119', '95', '2012-03-01'), ('120', '96', '2012-03-01'), ('121', '90', '2012-03-01'), ('122', '80', '2012-03-01'), ('123', '85', '2012-03-01'), ('124', '94', '2012-03-01'), ('125', '89', '2012-03-01'), ('126', '97', '2012-03-01'), ('127', '94', '2012-03-01'), ('128', '72', '2012-03-01'), ('129', '88', '2012-03-01'), ('130', '91', '2012-03-01')` bitbucket-25253 SSC-Dedicated Group: General Forum Members Points: 35141 Visits: 25280 Assume it is the median value that has you stumped, try reading this, it works for myself.http://www.mssqltips.com/sqlservertip/2523/script-to-calculate-the-median-value-for-sql-server-data/?utm_source=dailynewsletter&utm_medium=email&utm_content=headline&utm_campaign=20111030 If everything seems to be going well, you have obviously overlooked something. RonPlease help us, help you -before posting a question please read Before posting a performance problem please read Rich Mechaber SSCertifiable Group: General Forum Members Points: 5789 Visits: 3691 I had to do this as well, so you get my supply of bookmarks to help you on your way:http://www.sqlmag.com/article/tsql3/calculating-the-median-gets-simpler-in-sql-server-2005http://sqlblog.com/blogs/adam_machanic/archive/2006/12/18/medians-row-numbers-and-performance.aspxhttp://sqlblog.com/blogs/peter_debetta/archive/2006/12/20/Medians_Actual_Query_Cost_and_Statistics.aspxThe first one by Itzik Ben-Gan is the one I used.And lastly, a link to one of my prior posts here on SSC. It includes some add'l explanatory notes I wrote to remind myself how Ben-Gan's logic actually works:http://www.sqlservercentral.com/Forums/Topic923660-338-1.aspx#bm1105837HTH,Rich DataAnalyst011 SSCrazy Group: General Forum Members Points: 2002 Visits: 530 Awesome. Thanks a ton! DataAnalyst011 SSCrazy Group: General Forum Members Points: 2002 Visits: 530 Actually, I'm having some trouble converting over the example to my sample table above. I just took the grades for 03-01-2012. The median should be the average between 90 and 91 (so 90.5), correct? This gets that result set:`SELECT grade FROM estatsWHERE CourseDate = '2012-03-01'ORDER BY grade DESC `But when I run the below (which is an attempt to convert the example to my sample data above, I get 89.25. I'm sure I'm making a dumb mistake somewhere, but any help is appreciated.`WITH cte AS( SELECT EStats.PersonID, EStats.Grade, ROW_NUMBER() OVER(PARTITION BY EStats.PersonID ORDER BY EStats.GRADE) AS RN, COUNT(*) OVER(PARTITION BY EStats.PersonID) AS Cnt FROM EStats WHERE EStats.CourseDate = '2012-03-01')SELECT AVG(CAST(cte.Grade AS Numeric)) AS MedianFROM cteWHERE RN IN((cte.Cnt + 1) / 2, (Cnt + 2) / 2)` dwain.c SSC-Forever Group: General Forum Members Points: 44653 Visits: 6431 Maybe you need to:`PARTITION BY LEFT(CourseDate, 4)`? My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!My thought question: Have you ever been told that your query runs too fast?My advice:INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.Need to UNPIVOT? Why not CROSS APPLY VALUES instead?Since random numbers are too important to be left to chance, let's generate some!Learn to understand recursive CTEs by example.Splitting strings based on patterns can be fast!My temporal SQL musings: Calendar Tables, an Easter SQL, Time Slots and Self-maintaining, Contiguous Effective Dates in Temporal Tables DataAnalyst011 SSCrazy Group: General Forum Members Points: 2002 Visits: 530 Alright, its time to confess I'm still not getting this. Would anyone mind taking my original DDL and calculating median and lower/upper quartiles? I think if I can see it with my example data it would click for me. Any help is really appreciated! DataAnalyst011 SSCrazy Group: General Forum Members Points: 2002 Visits: 530 Below another attempt, but I seem to be running into the same problem. For instance, 2012-03-01 should produce 90.5, but the below gives me 89.25. `SELECT CourseDate, AVG(CAST(Grade AS Numeric)) FROM( SELECT CourseDate, Grade, ROW_NUMBER() OVER ( PARTITION BY PersonID ORDER BY Grade ASC, CourseDate ASC) AS RowAsc, ROW_NUMBER() OVER ( PARTITION BY PersonID ORDER BY Grade DESC, CourseDate DESC) AS RowDesc FROM EStats)xWHERE RowAsc IN (RowDesc, RowDesc - 1, RowDesc + 1)GROUP BY CourseDateORDER BY CourseDate`UPDATE: I think the problem is my PARTITION BY isn't working. In other words, the inner query produces a row_number with all 1's so there is nothing to match. Any ideas on a fix? DataAnalyst011 SSCrazy Group: General Forum Members Points: 2002 Visits: 530 Okay, using Dwain's suggestion I'm close to getting everything working. My only problem is the row_number in my inner query isn't working well when the numbers are the same. For instance, when you run my inner query...` SELECT CourseDate, Grade, ROW_NUMBER() OVER ( PARTITION BY LEFT(CourseDate, 4) ORDER BY Grade ASC) AS RowAsc, ROW_NUMBER() OVER ( PARTITION BY LEFT(CourseDate, 4) ORDER BY Grade DESC) AS RowDesc FROM EStats`...you'll notice it does this for 2010-03-01:RowAsc10986753421I think that is throwing off the snippet I added which Dwain suggested (THANKS, by the way!). Any suggestions on how to fix this?For those interested, here's what I have. I'm sure its about the worst way I could do this, but its all I've got at this stage in my SQL knowledge! Once I get the row_number ordering issue worked out, I think it will work.`WITH Q3 AS( SELECT CourseDate, AVG(CAST(Grade AS Numeric)) AS Median FROM ( SELECT CourseDate, Grade, ROW_NUMBER() OVER ( PARTITION BY LEFT(CourseDate, 4) ORDER BY Grade ASC) AS RowAsc, ROW_NUMBER() OVER ( PARTITION BY LEFT(CourseDate, 4) ORDER BY Grade DESC) AS RowDesc FROM EStats )x WHERE RowAsc IN (RowDesc, RowDesc - 1, RowDesc + 1) GROUP BY CourseDate --ORDER BY CourseDate),Q2 AS( SELECT x.CourseDate, AVG(CAST(Grade AS Numeric)) AS LowerQuartile FROM ( SELECT Estats.CourseDate, Estats.Grade, ROW_NUMBER() OVER ( PARTITION BY LEFT(EStats.CourseDate, 4) ORDER BY Grade ASC) AS RowAsc, ROW_NUMBER() OVER ( PARTITION BY LEFT(Estats.CourseDate, 4) ORDER BY Grade DESC) AS RowDesc FROM EStats JOIN Q3 on EStats.CourseDate = Q3.CourseDate WHERE EStats.Grade < Q3.Median )x WHERE RowAsc IN (RowDesc, RowDesc - 1, RowDesc + 1) GROUP BY x.CourseDate),Q4 AS( SELECT x.CourseDate, AVG(CAST(Grade AS Numeric)) AS UpperQuartile FROM ( SELECT Estats.CourseDate, Estats.Grade, ROW_NUMBER() OVER ( PARTITION BY LEFT(EStats.CourseDate, 4) ORDER BY Grade ASC) AS RowAsc, ROW_NUMBER() OVER ( PARTITION BY LEFT(Estats.CourseDate, 4) ORDER BY Grade DESC) AS RowDesc FROM EStats JOIN Q3 on EStats.CourseDate = Q3.CourseDate WHERE EStats.Grade > Q3.Median )x WHERE RowAsc IN (RowDesc, RowDesc - 1, RowDesc + 1) GROUP BY x.CourseDate)SELECT Q3.CourseDate, Q3.Median AS Median, Q2.LowerQuartile, Q4.UpperQuartile, MIN(EStats.Grade) AS Min, MAX(EStats.Grade) AS MaxFROM Q3 JOIN Q2 ON Q3.CourseDate = Q2.CourseDate JOIN Q4 ON Q3.CourseDate = Q4.CourseDate JOIN EStats ON Q3.CourseDate = EStats.CourseDateGROUP BY Q3.CourseDate, Q3.Median, Q2.LowerQuartile, Q4.UpperQuartileORDER BY Q3.CourseDate` DataAnalyst011 SSCrazy Group: General Forum Members Points: 2002 Visits: 530 I'm just going to bump this because I rewrote the above post and added what I've got to date...