Calculating a median in SQL Server

  • SQL Server does not come with a median or mode (for that matter) function outside of Analysis Services. Does anyone know of a good T-SQL method of implementing a true median. I'd prefer not having to use temporary tables but I'm not ruling them out.

    A median is defined as the middle value of a set of numbers when the data are ranked in order. The position of the median value is determined by (n plus 1)/2 where n is the number of elements in the set and it can be halfway between two values in the set if n is not even. The mode is the value in the set that occurs most frequently.

    Thanks!

  • Try this. Not sure it is the best or always works.

    create table MyTable

    ( MyID int

    )

    go

    insert MyTable select 1

    insert MyTable select 3

    insert MyTable select 5

    insert MyTable select 12

    insert MyTable select 15

    insert MyTable select 7

    insert MyTable select 17

    go

    select *

    from MyTable

    declare @i int

    select @i = count(*)

    from MyTable

    if @i % 2 = 0

    select @i = @i / 2

    else

    select @i = (@i / 2) + 1

    select @i

    set rowcount @i

    select @i = MyID

    from MyTable

    order by MyID

    select @i

    drop table MyTable

    Steve Jones

    steve@dkranch.net

  • That is an interesting approach, Steve. I notice that I did steer you wrong in my definitions above. The median can be halfway in between two elements of the set if n IS even. So if our number set was 1,3,5,12,15,7,17, and 8 then the median would be at the 4.5th position and so would be halfway between 7 and 8 and so would equal 7.5

  • Steve,

    Expanding your method I have a solution which, I think, will work (even if it is rather clunky!):

    create table MyTable
    
    ( MyID int
    )
    go
    insert MyTable select 1
    insert MyTable select 3
    insert MyTable select 5
    insert MyTable select 12
    insert MyTable select 15
    insert MyTable select 7
    insert MyTable select 17
    insert MyTable select 8
    go
    select *
    from MyTable

    declare @n int
    declare @n2 int
    declare @median decimal(9,2)
    select @n = count(*)

    from MyTable

    if (@n +1) % 2 = 0
    begin
    select @n = (@n+1) / 2
    select @n
    select @n2=@n+1
    set rowcount @n
    select @n = MyID
    from MyTable
    order by MyID
    select @n
    end

    else
    begin
    select @n = (@n+1) / 2
    select @n
    select @n2=@n+1
    set rowcount @n
    select @n = MyID
    from MyTable
    order by MyID
    select @n
    set rowcount @n2
    select @n2 = MyID
    from MyTable
    order by MyID
    select @n2
    select @median = (cast(@n as decimal(9,2)) + cast(@n2 as decimal(9,2)))/2.0
    select @median
    end

    drop table MyTable
    set rowcount 0

    Edited by - tgrignon@compusense.com on 01/29/2002 3:39:57 PM

  • Another approach:

    DECLARE @n int

    DECLARE @SQL nvarchar(255)

    SET @n = (SELECT COUNT(*) FROM MyTable)

    IF @n % 2 = 0

    IF @n = 2

    SET @SQL = 'SELECT AVG(MyID) Median FROM MyTable'

    ELSE

    SET @SQL = 'SELECT AVG(MyId) Median FROM (SELECT TOP 2 MyID FROM '

    + '(SELECT TOP ' + CAST(((@n / 2) + 1) AS varchar)

    + ' MyID FROM MyTable ORDER BY MyID DESC) A ORDER BY MyID ASC) B'

    ELSE

    SET @SQL = 'SELECT TOP 1 MyID Median FROM (SELECT TOP '

    + CAST(((@n / 2) + 1) AS varchar)

    + ' MyID FROM MyTable ORDER BY MyID ASC) A ORDER BY MyID DESC'

    EXEC(@SQL)

    K. Brian Kelley

    bkelley@sqlservercentral.com

    http://www.sqlservercentral.com/columnists/bkelley/

    K. Brian Kelley
    @kbriankelley

  • Thanks Brian! That approach seems to work and is considerably more elegant than my approach. There is one assumption that seems to be important with it, though, and that is that the numbers that come into it are decimal or float. If the numbers are integer this strategy will not give the decimal portion if it is present.

  • What happens if we replace AVG(MyID) with AVG(CAST(MyID AS Decimal(9,2))) to ensure SQL is averaging a decimal value?

    K. Brian Kelley

    bkelley@sqlservercentral.com

    http://www.sqlservercentral.com/columnists/bkelley/

    K. Brian Kelley
    @kbriankelley

  • That would do the trick!!! Thanks again.

  • I'm struggling with trying to create a UDF out of Brian's solution. Is it possible to pass a recordset to an UDF and return the median? I was hoping to use this UDF as an aggregate function but I have doubts that this is possible. Anyone know?

  • Thought this would work, but the table datatype (from bol)

    table

    A special data type used to store a result set for later processing. The table data type can be used only to define local variables of type table or the return value of a user-defined function.

    You might be able to get this to work, but would need a temp table and reference that from the UDF. Or a perm table.

    Steve Jones

    steve@dkranch.net

  • It is an old listing but I need this solution for getting Median along with other aggregates in my query single select statement based on specific group by clause.

    Anybody got any brighter ideas to add to Brian's solution so that it can be efficiently used like AVG function.

    I wonder why MS didn't provide it :

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply