TSQL and percentile (not percentilerank :)

  • Hi all,

    I would like to ask for some help, and maybe best practices, on calculating the 90th percentile using T-SQL. To be clear, after searching online: I'm looking for a single value that summarizes a set of numbers over a time frame. For example, what is the 90th percentile of students' exam durations for the week of July 1st?

    I believe I have figured out the formula below (which excludes NULLs):

    select max(case when rownum*1.0/numrows <= 0.9 then <column> end) as percentile_90th
    from (select <column>,
                 row_number() over (order by <column>) as rownum,
                 count(*) over (partition by NULL) as numrows
          from <table>
          where <column> is not null AND DATE = '12-17-2012' (etc)
         ) t

    There are now two items I'm still working on to complete this query at my workplace:

    1 - Get two decimal places in the result (right now the output has no decimal places).

    2 - Make the date dynamic (for last week, for example), replacing the static date with DATE >= CONVERT(date, GETDATE() - 8).
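Both open items can be sketched on top of the original query. This is a minimal, untested sketch rather than a definitive answer: the temp table `#ExamDuration` and its columns `ExamDate` and `DurationMinutes` are hypothetical stand-ins for `<table>`, `DATE`, and `<column>`, and "last week" is assumed to mean the last eight days, as in the filter proposed above.

```sql
-- Hypothetical sample data standing in for <table> and <column>.
CREATE TABLE #ExamDuration (
    ExamDate        date NOT NULL,
    DurationMinutes int  NULL
);

INSERT INTO #ExamDuration (ExamDate, DurationMinutes)
SELECT CAST(DATEADD(DAY, -v.n, GETDATE()) AS date), v.n * 7 + 30
FROM (VALUES (1), (2), (3), (4), (5), (6), (7)) AS v(n);

SELECT
    -- Item 1: cast the result so it is reported with two decimal places.
    CAST(MAX(CASE WHEN rownum * 1.0 / numrows <= 0.9 THEN DurationMinutes END)
         AS decimal(10, 2)) AS percentile_90th
FROM (
    SELECT DurationMinutes,
           ROW_NUMBER() OVER (ORDER BY DurationMinutes) AS rownum,
           COUNT(*)     OVER ()                         AS numrows
    FROM #ExamDuration
    WHERE DurationMinutes IS NOT NULL
      -- Item 2: rolling window relative to today instead of a fixed date.
      AND ExamDate >= DATEADD(DAY, -8, CAST(GETDATE() AS date))
) AS t;

DROP TABLE #ExamDuration;
```

Note that the cast only changes how the result is displayed. Because this method returns one of the stored values, an integer column will still produce a result ending in .00.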

    Before I spend more time on this, I wanted to see what the community considers the best method to calculate a percentile. I did a spot check against the extracted raw data and found this calculation is right on, except for one date range I used, and I'm not sure why yet.

    Any advice is appreciated!

  • So it seems I spoke too soon: this formula only gives you the row at the 90th percentile, not the value.

    What I'm looking for is a 90th percentile value 🙁
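For a single 90th percentile value, one option worth checking is the built-in `PERCENTILE_CONT` and `PERCENTILE_DISC` functions. The sketch below assumes SQL Server 2012 or later (where these functions are available) and uses a hypothetical temp table `#ExamDuration` with made-up columns `ExamDate` and `DurationMinutes`. It is an illustration under those assumptions, not the solution the poster ended up with.

```sql
-- Assumes SQL Server 2012 or later; table and column names are hypothetical.
CREATE TABLE #ExamDuration (
    ExamDate        date NOT NULL,
    DurationMinutes int  NULL
);

INSERT INTO #ExamDuration (ExamDate, DurationMinutes)
SELECT CAST(DATEADD(DAY, -v.n, GETDATE()) AS date), v.n * 7 + 30
FROM (VALUES (1), (2), (3), (4), (5), (6), (7)) AS v(n);

-- Both functions are window functions in T-SQL, so every row carries the
-- same result; DISTINCT collapses the output to a single row.
SELECT DISTINCT
    -- Interpolates between neighbouring values, so decimals are possible.
    CAST(PERCENTILE_CONT(0.9) WITHIN GROUP (ORDER BY DurationMinutes) OVER ()
         AS decimal(10, 2)) AS p90_interpolated,
    -- Returns a value that actually occurs in the data.
    PERCENTILE_DISC(0.9) WITHIN GROUP (ORDER BY DurationMinutes) OVER ()
        AS p90_actual_value
FROM #ExamDuration
WHERE DurationMinutes IS NOT NULL
  AND ExamDate >= DATEADD(DAY, -8, CAST(GETDATE() AS date));

DROP TABLE #ExamDuration;
```

`PERCENTILE_CONT` may differ from the result of a row-number based formula because it interpolates, so which of the two matches another tool's percentile depends on the definition that tool uses.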

  • If you can post some DDL, sample data, and desired output, you will find lots of people diving into this one. Please see the first article in my signature for best practices when posting questions.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Moden's splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 - Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Thank you for the information. I would like to delete this topic, as I have found my answer. After reading a bit, it seems that we can't delete our own topics?

  • cs_source (12/31/2012)


    Thank you for the information. I would like to delete this topic, as I have found my answer. After reading a bit, it seems that we can't delete our own topics?

    Having asked a question and upon finding your own answer, it is actually good forum etiquette to post how you solved your problem. Others may come along with the same or similar question and your answer may help them.

  • Was your solution something like this?

    ;WITH Scores (Score, Percentile) AS (
        SELECT Score, NTILE(10) OVER (ORDER BY Score)
        FROM sys.all_columns
        CROSS APPLY (SELECT 51 + ABS(CHECKSUM(NEWID())) % 50) a(Score)
    )
    SELECT Score
    FROM Scores
    WHERE Percentile > 9
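The NTILE query above returns every score in the top tenth rather than one figure. If a single boundary value is wanted, one possible variation is to take the lowest score in the top tile. This is only a rough sketch: it keeps the random test data from the query above, the column alias `percentile_90th_approx` is made up for illustration, and the result is an approximation of the 90th percentile rather than an interpolated value.

```sql
-- Same random test data as the query above: one score from 51 to 100
-- per row of sys.all_columns.
;WITH Scores (Score, Percentile) AS (
    SELECT Score, NTILE(10) OVER (ORDER BY Score)
    FROM sys.all_columns
    CROSS APPLY (SELECT 51 + ABS(CHECKSUM(NEWID())) % 50) a(Score)
)
-- The smallest score in the tenth tile marks the lower edge of the
-- top 10 percent of rows.
SELECT MIN(Score) AS percentile_90th_approx
FROM Scores
WHERE Percentile = 10;
```

Because the scores are generated with NEWID(), the value returned will vary from run to run.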


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
    Since random numbers are too important to be left to chance, let's generate some!
    Learn to understand recursive CTEs by example.

  • Hi Lynn & Dwayne,

    I spent a few days trying to find an answer, and the closest I got was the first query I posted; however, that query doesn't work for what I need. For now I have decided to go back to SAS to calculate the percentile, but I will tackle this again once I have a few things off my plate.

    If you like, I can post the SAS code? 🙂

  • cs_source (1/3/2013) If you like, I can post the SAS code? 🙂

    Not the code for percentile, but if you can figure out a way to use OPENROWSET with the SAS OLE providers I would sure appreciate it. I am able to use VBA in Excel to read SAS files, and to use the import wizard in SQL Server to import .sas7bdat files, but for the life of me I can't get OPENROWSET to work with the same SAS providers that Excel has no problem with.

    Greg
    _________________________________________________________________________________________________
    The glass is at one half capacity: nothing more, nothing less.

  • At this time I only use SAS for the percentile, not for any other function, so your solution is one step further than mine. When I review this, hopefully within the next two weeks, I'll focus on just getting this done in SQL.

  • cs_source (1/3/2013)


    At this time I only use SAS for the percentile, not for any other function, so your solution is one step further than mine. When I review this, hopefully within the next two weeks, I'll focus on just getting this done in SQL.

    Well, I'm not intending to steer anyone away from SQL Server, as I use it for 95% of my data handling. But if you have a license for SAS, you would be well served to start learning how to program in it. It is an extremely powerful tool, and because of the exorbitant annual license fee, most folks don't have access to it. It is a good thing to be able to put on your resume.

    Greg

  • Greg Snidow (1/3/2013)


    cs_source (1/3/2013) If you like, I can post the SAS code? 🙂

    Not the code for percentile, but if you can figure out a way to use OPENROWSET with the SAS OLE providers I would sure appreciate it. I am able to use VBA in Excel to read SAS files, and to use the import wizard in SQL Server to import .sas7bdat files, but for the life of me I can't get OPENROWSET to work with the same SAS providers that Excel has no problem with.

    Given the power of SAS, if you can ever make it work it sounds like the makings of a great article!



  • dwain.c (1/3/2013) Given the power of SAS, if you can ever make it work it sounds like the makings of a great article!

    Indeed, I believe it would be. I find it quite annoying to have to use the import wizard. Per the SAS documentation, not all of the 4 providers support SQL operations, so there is that issue. However, I tend to think that the way an ADO recordset handles the data access would be similar in nature to the way OPENROWSET would handle it. Of course, that is just my totally uninformed speculation. I'll keep plugging away at it, and if I do come up with a solution I'll certainly share the wealth. One problem is that you only get the providers by having SAS installed on your machine. Now, SAS does provide just the providers for free, but I have been unable to get them to work on my personal machine, where I don't have SAS installed.

    Greg
