August 13, 2015 at 3:22 am
Hello,
/First post
I'm new to SQL and i'm in a spot of bother trying to work out how to do something!
I have a database with 1million+ records in and i'm trying to collect the median values of column(2) for all distinct values in column (1)
Example DB:
Column 1 Column 2
978555 500
978555 502
978555 480
978555 490
978324 1111
978324 1102
978311 122
978311 120
978994 804
978320 359
and I need it to display on SELECT as
column 1 column 2
978555 495
978324 1106
978311 121
978994 804
978320 359
Is this possible on 2008 R2 and could anyone help me with this please? thank you in advance!
August 13, 2015 at 4:28 am
djones 23745 (8/13/2015)
Hello,/First post
I'm new to SQL and i'm in a spot of bother trying to work out how to do something!
I have a database with 1million+ records in and i'm trying to collect the median values of column(2) for all distinct values in column (1)
Example DB:
Column 1 Column 2
978555 500
978555 502
978555 480
978555 490
978324 1111
978324 1102
978311 122
978311 120
978994 804
978320 359
and I need it to display on SELECT as
column 1 column 2
978555 495
978324 1106
978311 121
978994 804
978320 359
Is this possible on 2008 R2 and could anyone help me with this please? thank you in advance!
this may help
how did you calculate the following?
column 1 column 2
978555 495
________________________________________________________________
you can lead a user to data....but you cannot make them think
and remember....every day is a school day
August 13, 2015 at 5:16 am
This article is worth a read too.
That's a strange median, as JLS points out.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
August 13, 2015 at 7:13 am
Dwain's article is a great resource. Hopefully, it will help the OP.
Could you explain why do you say that the median is wrong? It's the average of the 2 middle rows. (490+500)/2=495
August 13, 2015 at 7:22 am
Thank you for the reply! I shall have a look now, looks good!
My example DB was just a quick, off the top of my head example with ballpark medians which is why it is wrong.
Thanks!
August 13, 2015 at 10:37 am
hi
the below code may helpful
select distinct column1,AVG(column2) maridain
from table
group by column1
August 13, 2015 at 10:42 am
The DISTINCT keyword is redundant as you're using GROUP BY. You're also giving average which is the mean and not the median.
August 14, 2015 at 2:01 pm
Has anyone considered something like this?
DECLARE @TEST AS TABLE (
Column1 int,
Column2 int
);
INSERT INTO @TEST (Column1, Column2)
VALUES (978555, 500),
(978555, 502),
(978555, 480),
(978555, 490),
(978324, 1111),
(978324, 1102),
(978311, 122),
(978311, 120),
(978994, 804),
(978320, 359);
WITH RAW_DATA AS (
SELECT T.Column1, T.Column2,
COUNT(T.Column2) OVER(PARTITION BY T.Column1) AS THE_COUNT,
ROW_NUMBER() OVER(PARTITION BY T.Column1 ORDER BY T.Column2) AS RN,
CASE
WHEN COUNT(T.Column2) OVER(PARTITION BY T.Column1) % 2 = 0
AND ROW_NUMBER() OVER(PARTITION BY T.Column1 ORDER BY T.Column2) IN
(COUNT(T.Column2) OVER(PARTITION BY T.Column1) / 2,
(COUNT(T.Column2) OVER(PARTITION BY T.Column1) / 2) + 1)
THEN 1
WHEN COUNT(T.Column2) OVER(PARTITION BY T.Column1) % 2 = 1
AND ROW_NUMBER() OVER(PARTITION BY T.Column1 ORDER BY T.Column2) =
COUNT(T.Column2) OVER(PARTITION BY T.Column1) / 2 + 1
THEN 1
ELSE 0
END AS IS_MIDDLE
FROM @TEST AS T
)
SELECT R.Column1, AVG(R.Column2) AS MEDIAN
FROM RAW_DATA AS R
WHERE R.IS_MIDDLE = 1
GROUP BY R.Column1
ORDER BY R.Column1
Results:
Column1MEDIAN
978311121
978320359
9783241106
978555495
978994804
Steve (aka sgmunson)
Rent Servers for Income (picks and shovels strategy)
Viewing 8 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply
This website stores cookies on your computer.
These cookies are used to improve your website experience and provide more personalized services to you, both on this website and through other media.
To find out more about the cookies we use, see our Privacy Policy