August 14, 2003 at 5:52 pm
i need to compute the median values for several columns of a bunch of tables, so i'm trying to write a stored procedure to compute medians as t-sql doesn't already have one. i've already dug up the code to do this from this site:
-----------------------------------
DECLARE @cnt INT
DECLARE @fpos INT
DECLARE @output FLOAT
SELECT @cnt = COUNT(*) FROM bocomp where debt is not null
IF ( @cnt % 2 ) = 0
BEGIN
SET @fpos = @cnt / 2
SELECT @output = avg(cast(debt as float)) FROM (
SELECT (
select count(*) + 1 from bocomp b where b.debt < a.debt and b.debt is not null
) AS ROW, a.debt FROM bocomp a where a.debt is not null
) AS Built
WHERE ROW IN (@fpos, @fpos + 1)
END
ELSE
BEGIN
SET @fpos = ( ( ( @cnt - 1 ) / 2 ) + 1 )
SELECT @output = debt FROM (
SELECT (
select count(*) + 1 from bocomp b where b.debt < a.debt and b.debt is not null
) AS ROW, a.debt FROM bocomp a where a.debt is not null
) AS Built
WHERE ROW = @fpos
END
SELECT @output AS MEDIAN
-----------------------------------
what i want to do is change each "bocomp" to "@cTable" (some kind of varchar variable) and each "debt" to "@cField", so i can put this in a stored procedure with those two parameters. when i try this directly, it doesn't work. also, when i try building each line as a string and executing it that way, it doesn't work because of the other variable names being worked with.
i'm a bit rusty with my t-sql (haven't worked with this in about a year), so any help is appreciated!
August 14, 2003 at 10:49 pm
I've tried another method of grabbing the middle rows. Using TOP to get the top half (and a bit) of the rows, and TOP again to get the bottom 1 or 2 rows of that sample.
It's a script at the moment (for testing purposes), but you can play with the @col and @tbl variables to see if it comes up with your expected values.
declare @rows int, @topn int, @samplen int, @sql nvarchar(1000), @median float
declare @col sysname set @col = 'id'
declare @tbl sysname set @tbl = 'sysobjects'
select @sql = 'select @rows = count(' + @col + ') from ' + @tbl + ' where ' + @col + ' is not null'
exec sp_executesql @sql, N'@rows int output', @rows OUTPUT
select @topn = case when @rows % 2 = 1 then 1 else 2 end
select @samplen = (@rows / 2) + 1
select @topn = case when @rows % 2 = 1 then 1 else 2 end
select @rows AS [Total Candidate Rows], @topn AS [Number of Median Rows], @samplen AS [Rows to Sample]
select @sql =
'select @median = avg(cast(' + @col + ' as float)) from
(select top ' + cast(@topn as varchar) + ' ' + @col + ' from
(select top ' + cast(@samplen as varchar) + ' ' + @col + ' from ' + @tbl + ' where ' + @col + ' is not null order by ' + @col + ') [half]
order by ' + @col + ' desc) [middle]'
exec sp_executesql @sql, N'@median float output', @median OUTPUT
select @median as MEDIAN
Hope it helps.
Cheers,
- Mark
Cheers,
- Mark
Viewing 2 posts - 1 through 2 (of 2 total)
You must be logged in to reply to this topic. Login to reply