Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 12»»

Effective T-SQL for Median Expand / Collapse
Author
Message
Posted Tuesday, July 30, 2013 4:20 PM


SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Yesterday @ 6:43 AM
Points: 459, Visits: 426
Is this an effective way to find median?

CREATE TABLE #T (i TINYINT);

GO

INSERT INTO #T
SELECT ABS(CHECKSUM(NEWID())) % 250

GO 1000

;WITH t AS (SELECT i, ROW_NUMBER() OVER (ORDER BY i) AS n, count(*) OVER (PARTITION BY (SELECT NULL)) AS c FROM #T)
SELECT
AVG(i) AS Median
FROM t
WHERE
( c % 2 = 1 AND n = ((c-1)/2)+1 )
OR (c % 2 = 0 AND (n = c / 2 OR n = c / 2 + 1))

Post #1479218
Posted Tuesday, July 30, 2013 4:52 PM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Yesterday @ 3:37 PM
Points: 3,374, Visits: 7,300
Last week I came up with this option. I haven't made performace tests. Maybe you could help me with that.

DECLARE @test	TABLE(
myint int)
INSERT INTO @test VALUES(1),(2),(3),(4),(5),(6),(7),(8),(9);

WITH CTE AS(
SELECT myint,
NTILE( 2) OVER( ORDER BY myint) tile
FROM @test
)
SELECT CASE WHEN SUM( CASE WHEN tile = 1 THEN 1 END) = SUM( CASE WHEN tile = 2 THEN 1 END)
THEN (MAX( CASE WHEN tile = 1 THEN myint END) + MIN( CASE WHEN tile = 2 THEN myint END))/2.0
ELSE MAX( CASE WHEN tile = 1 THEN myint END) END
FROM CTE




Luis C.
I am a great believer in luck, and I find the harder I work the more I have of it. Stephen Leacock

Forum Etiquette: How to post data/code on a forum to get the best help
Post #1479227
Posted Tuesday, July 30, 2013 6:06 PM


SSCarpal Tunnel

SSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal Tunnel

Group: General Forum Members
Last Login: Friday, July 25, 2014 3:19 AM
Points: 4,320, Visits: 6,113
Fortunately for you SQL 2012 has much better Windowing Function support. Look into the PERCENTILE_CONT function. I also HIGHLY recommend you purchase Itzik Ben-Gan's SQL Server 2012 High-Performance TSQL Using Window Functions book.

Best,

Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru at GMail
Post #1479235
Posted Tuesday, July 30, 2013 6:48 PM


SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Yesterday @ 6:43 AM
Points: 459, Visits: 426
TheSQLGuru (7/30/2013)
Fortunately for you SQL 2012 has much better Windowing Function support. Look into the PERCENTILE_CONT function. I also HIGHLY recommend you purchase Itzik Ben-Gan's SQL Server 2012 High-Performance TSQL Using Window Functions book.


This is awesome thanks!
Post #1479246
Posted Wednesday, July 31, 2013 3:04 PM


Mr or Mrs. 500

Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500

Group: General Forum Members
Last Login: Monday, July 28, 2014 4:49 AM
Points: 556, Visits: 2,581
dkschill (7/30/2013)
Is this an effective way to find median?

CREATE TABLE #T (i TINYINT);

GO

INSERT INTO #T
SELECT ABS(CHECKSUM(NEWID())) % 250

GO 1000

;WITH t AS (SELECT i, ROW_NUMBER() OVER (ORDER BY i) AS n, count(*) OVER (PARTITION BY (SELECT NULL)) AS c FROM #T)
SELECT
AVG(i) AS Median
FROM t
WHERE
( c % 2 = 1 AND n = ((c-1)/2)+1 )
OR (c % 2 = 0 AND (n = c / 2 OR n = c / 2 + 1))



First and foremost, for your sample data, don't use GO 1000. You will find a tally table works much better for this kind of thing.

WITH t(i) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)) FROM sys.all_objects)
INSERT INTO #T
SELECT ABS(CHECKSUM(NEWID()))%200
FROM t
WHERE i<=1000;



My non-2012 solution is:

;WITH t(i,n,c) AS
( SELECT i,
ROW_NUMBER() OVER (ORDER BY i),
COUNT(*) OVER (PARTITION BY (SELECT NULL))
FROM #T)
SELECT
AVG(i) AS Median
FROM t
WHERE (c%2=1 AND n=((c-1)/2)+1) OR (c%2=0 AND (n=c/2 OR n=c/2+1));

I tested this against Louis' solution (Brilliant use of NTILE BTW) and, though it appears to produce a better query plan (more parallelism, lower est. query cost) it runs about the same speed as what Louis put together (~7 seconds/1 million rows on my crappy test box [4-cpu, 4gb ram test]).

Edit: code formatting messed up.


-- Alan Burstein



Read this article for best practices on asking questions.
Need to split a string? Try this (Jeff Moden)
Need a pattern-based string spitter? Try this (Dwain Camps)
My blog
Post #1479711
Posted Wednesday, July 31, 2013 3:12 PM


SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Yesterday @ 6:43 AM
Points: 459, Visits: 426
Thanks for your reply! I am always looking to learn more.
Post #1479712
Posted Wednesday, July 31, 2013 3:27 PM


Mr or Mrs. 500

Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500

Group: General Forum Members
Last Login: Monday, July 28, 2014 4:49 AM
Points: 556, Visits: 2,581
TheSQLGuru (7/30/2013)
Fortunately for you SQL 2012 has much better Windowing Function support. Look into the PERCENTILE_CONT function. I also HIGHLY recommend you purchase Itzik Ben-Gan's SQL Server 2012 High-Performance TSQL Using Window Functions book.


Just an interesting FYI...

I have never used PERCENTILE_CONT but I do have Server 2012 High-Performance TSQL Using Window Functions. With the same sample data from earlier (#T), this is a modified version of Ben-Gan's 2012-based solution using PERCENTILE_COUNT:

WITH t(r,m) AS
( SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)),
PERCENTILE_CONT(0.5) WITHIN GROUP(ORDER BY i) OVER(PARTITION BY NULL)
FROM #T
)
SELECT m AS median
FROM t
WHERE r=1;

Though it is much simpler, Ben-Gan's 2012-PERCENTILE_CONT solution is about twice as slow as any other solution posted thus far. Again, just an FYI.


-- Alan Burstein



Read this article for best practices on asking questions.
Need to split a string? Try this (Jeff Moden)
Need a pattern-based string spitter? Try this (Dwain Camps)
My blog
Post #1479718
Posted Wednesday, July 31, 2013 3:38 PM


SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Yesterday @ 6:43 AM
Points: 459, Visits: 426
Thanks for the example!
Post #1479724
Posted Thursday, August 1, 2013 1:16 AM


Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Tuesday, November 26, 2013 8:40 AM
Points: 316, Visits: 908
The original query can be formulated a little bit more elegantly as this:

;WITH t(i,n,c) AS
( SELECT i,
ROW_NUMBER() OVER (ORDER BY i),
COUNT(*) OVER (PARTITION BY (SELECT NULL))
FROM #T)
SELECT
AVG(i) AS Median
FROM t
WHERE 2*n-c BETWEEN 0 AND 2;

This version should be slightly faster because of the simpler expression in the WHERE clause

/SG
Post #1479811
Posted Thursday, August 1, 2013 1:34 AM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Yesterday @ 3:55 PM
Points: 13,357, Visits: 10,222
TheSQLGuru (7/30/2013)
Fortunately for you SQL 2012 has much better Windowing Function support. Look into the PERCENTILE_CONT function. I also HIGHLY recommend you purchase Itzik Ben-Gan's SQL Server 2012 High-Performance TSQL Using Window Functions book.


+1 great book!




How to post forum questions.
Need an answer? No, you need a question.
What’s the deal with Excel & SSIS?

Member of LinkedIn. My blog at LessThanDot.

MCSA SQL Server 2012 - MCSE Business Intelligence
Post #1479818
« Prev Topic | Next Topic »

Add to briefcase 12»»

Permissions Expand / Collapse