Standard Deviation

Question

Post reply

Standard Deviation

jkelly

Right there with Babe

Points: 765
More actions
January 27, 2011 at 9:42 pm

#246019

Comments posted to this topic are about the item Standard Deviation

Viewing 15 posts - 1 through 15 (of 28 total)

You must be logged in to reply to this topic. Login to reply

Koen Verbeeck SSC Guru Points: 259085 More actions · Answer 1

Nice question, really learned something.

The BOL page doesn't even mention this subject.

Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP

sharath.chalamgari SSCertifiable Points: 5680 More actions · Answer 2

I think the Explanation is bit confusing.

if you give any numeric value to the function without any from clause it will give NULL.

SELECT STDEV(100)

"The STDEV function uses all of the values to calculate the standard deviation. There must be at least two values in the field or the standard deviation will not be calculated and a NULL is returned."

example in my PackageTable the standard deveation of pa_rates column can find using

select stdev(pa_Rates) From Package

for the above Query i will get some result

but if i cange the Query with some where condition,where i will get only one Row (sample) then the stdevp will results the null value

select stdev(pa_Rates) From Package where pa_PackageID = 1

Koen Verbeeck SSC Guru Points: 259085 More actions · Answer 3

sharath.chalamgari (1/28/2011)
but if i cange the Query with some where condition,where i will get only one Row (sample) then the stdevp will results the null value
select stdev(pa_Rates) From Package where pa_PackageID = 1

The WHERE clause is executed first, so you are still calculating the variance over 1 row, hence the NULL.

Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP

sharath.chalamgari SSCertifiable Points: 5680 More actions · Answer 4

Koen (da-zero) (1/28/2011)
sharath.chalamgari (1/28/2011)
but if i cange the Query with some where condition,where i will get only one Row (sample) then the stdevp will results the null value
select stdev(pa_Rates) From Package where pa_PackageID = 1
The WHERE clause is executed first, so you are still calculating the variance over 1 row, hence the NULL.

Ya that's what i have explained above.

Koen Verbeeck SSC Guru Points: 259085 More actions · Answer 5

sharath.chalamgari (1/28/2011)
Koen (da-zero) (1/28/2011)
sharath.chalamgari (1/28/2011)
but if i cange the Query with some where condition,where i will get only one Row (sample) then the stdevp will results the null value
select stdev(pa_Rates) From Package where pa_PackageID = 1
The WHERE clause is executed first, so you are still calculating the variance over 1 row, hence the NULL.
Ya that's what i have explained above.

Then why is the explanation confusing? I find it pretty clear...

Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP

M&M SSC-Insane Points: 21700 More actions · Answer 6

Thanks for introducing new topic. I had not heard about it before

M&M

mdv 9731 SSC Enthusiast Points: 171 More actions · Answer 7

I do not agree that the answer is right.

Think it is supposed to be 0.

the standard deviation s (sigma) is the square root of the average value of (X - µ)2.

Which means s = sqrt(((1-1)^2)/1) = 0

Or in other words std. deviation equals sqrt of (population-avarage) sq / number of population values

Please correct me if im wrong on the formula it has been a while since i used my statistics 😀

Koen Verbeeck SSC Guru Points: 259085 More actions · Answer 8

mdv 9731 (1/28/2011)
I do not agree that the answer is right.
Think it is supposed to be 0.
the standard deviation s (sigma) is the square root of the average value of (X - µ)2.
Which means s = sqrt(((1-1)^2)/1) = 0
Or in other words std. deviation equals sqrt of (population-avarage) sq / number of population values
Please correct me if im wrong on the formula it has been a while since i used my statistics 😀

Even if your statistics say it is 0, SQL Server returns NULL, so the answer is still correct.

You can verify it by running the query in SQL Server Management Studio.

Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP

mdv 9731 SSC Enthusiast Points: 171 More actions · Answer 9

I'm not really going to comment on that answer.

Just happy you are not creating any kind of BI og reporting in our company 🙂

I might be wrong but i still question the result.

Even if i know that it doesnt make much practical sence to talk deviation on 1 value.

Toreador SSChampion Points: 11358 More actions · Answer 10

Koen (da-zero) (1/28/2011)
Then why is the explanation confusing? I find it pretty clear...

The explanation is "The denominator of the variance for a sample population is n - 1, resulting in division by zero."

This suggests it should give an error, and doesn't explain why it results in null instead.

I guessed wrong, but I'm not sure what the question was supposed to prove, other than if you apply a function wrongly then you get a meaningless answer!

Koen Verbeeck SSC Guru Points: 259085 More actions · Answer 11

mdv 9731 (1/28/2011)
I'm not really going to comment on that answer.
Just happy you are not creating any kind of BI og reporting in our company 🙂
I might be wrong but i still question the result.
Even if i know that it doesnt make much practical sence to talk deviation on 1 value.

The question was about the result of the query (ergo: what does SQL Server do with it).

Not what is the theoretical result according to statistics. (however, the explanation might be lacking. I'm (luckily) not a statistician)

But don't worry, if I'll do BI or reporting in your company, I will read the requirements thoroughly 😉

Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP

sharath.chalamgari SSCertifiable Points: 5680 More actions · Answer 12

Koen (da-zero) (1/28/2011)
Then why is the explanation confusing? I find it pretty clear...

It is Confusing because in his Query the parameter 1 does not mean that Number of samples.if you put 100 inplace of 1 still it gives the same null. as per his explanation it is not a divide by zero error

The usage of the Function is wrong in the Query.

see if we can use some thing like below it gives 0

SELECT STDEV(1) from sys.objects

Koen Verbeeck SSC Guru Points: 259085 More actions · Answer 13

sharath.chalamgari (1/28/2011)
Koen (da-zero) (1/28/2011)
Then why is the explanation confusing? I find it pretty clear...
It is Confusing because in his Query the parameter 1 does not mean that Number of samples.if you put 100 inplace of 1 still it gives the same null. as per his explanation it is not a divide by zero error
The usage of the Function is wrong in the Query.
see if we can use some thing like below it gives 0
SELECT STDEV(1) from sys.objects

Allright, that makes sense. Thanks for the explanation.

Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP

mdv 9731 SSC Enthusiast Points: 171 More actions · Answer 14

That makes sence.

So its really a matter of SQL Server returning NULL instead of #ERROR not having the correct syntax which obviously means that a FROM clause needs to be included.

So the point proven is more like SQL Server returning a 'value' instead of an error.

And not that it can't calculate 🙂 (and is has nothing to do with a zero division)

'cause my point was that std. dev. is 0 if calculated on a single value no matter what it is (even if that makes no sence)

Question good, formula good, Microsoft needs to work on documentation

We're all happy

Cheers