NULL Aggregates

  • Mighty (1/7/2015)


    I think 1 + 2 + 3 + NULL is the same as 1 + 2 + 3 + 0.

    In relation to this, I don't fully agree. For me 1 + 2 + 3 + NULL should return NULL, the same as that 'string1' + 'string2' + 'string3' + NULL returns NULL.

    This is a bit funny in SQL.

    1+2+3+NULL = NULL, but

    1

    2

    3

    + NULL

    ------

    6


    Just because you're right doesn't mean everybody else is wrong.

  • Xavon (1/7/2015)


    Ed Wagner (1/7/2015)


    Koen Verbeeck (1/7/2015)


    Ed Wagner (1/7/2015)


    Koen Verbeeck (1/7/2015)


    Xavon (1/7/2015)


    I wonder how MS is defining 'ignore'; because I would argue that if SUM ignored NULLs, I would not have to use half as many ISNULL(<column>, 0) statements.

    I think 1 + 2 + 3 + NULL is the same as 1 + 2 + 3 + 0.

    But they aren't the same. The NULLs aren't included, so 1 + 2 + 3 + NULL would be the same as 1 + 2 + 3.

    The result is the same. Both times 6.

    My point was that if you replace NULL with 0 or not, it doesn't matter when using SUM.

    (and 1 + 2 + 3 is the same as 1 + 2 + 3 + 0 :-P)

    The point is correct, so the ISNULL(column, 0) isn't necessary unless MS plans to revise the SUM function to require it. πŸ˜‰

    The real lesson is don't comment before your coffee, or your brain might decide that SUM is the same as arithmetic addition...

    Can you give an example to show how SUM is different than arithmetic addition and also how using isnull(<column>,0) with SUM gives you different results than just ignoring the nulls?

  • GROUPING was a bad choice for the question, so I'll award points back. My apologies for that. I've changed it.

    Regarding COUNT, the * doesn't matter for NULLs as you're comparing the expression (COUNT(*)) to the data, which may have a NULL. To me, that's misplaced thinking.

    Whether or not COUNT ignores NULLs depends on the expression, without a doubt. However that means that COUNT can ignore NULLs. I hesitated to use "can" since that usually causes all sorts of interpretation questions. In my mind, if you look at these, then you can't say that COUNT is always COUNT(col). It can be COUNT(*), which is often used. Therefore that's the answer (assuming GROUPING is removed).

    As far as SUM(1+2+3+NULL), that's a separate debate. It's 6, though for SQL Server.

  • I think the use of the term "ignores" is not semantically correct (even if it used by BOL). "Excludes" is more accurate.

    1 + 2 + 3 + NULL is NULL, absolutely. NULL is not replaceable by zero. Any set with a null value has a null sum, since null is "unknown" and therefore is free to range from the minimum to the maximum values of the domain. I have one full box of eggs, and another with an unknown number of eggs. How many eggs do I have?

    Don Simpson



    I'm not sure about Heisenberg.

  • DonlSimpson (1/7/2015)


    1 + 2 + 3 + NULL is NULL, absolutely. NULL is not replaceable by zero. Any set with a null value has a null sum, since null is "unknown" and therefore is free to range from the minimum to the maximum values of the domain. I have one full box of eggs, and another with an unknown number of eggs. How many eggs do I have?

    Nope.

    SELECT SUM(N)

    FROM

    (

    SELECT N = 1

    UNION ALL

    SELECT 2

    UNION ALL

    SELECT 3

    UNION ALL

    SELECT NULL

    ) tmp;

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Got the answer by deduction and removing the obvious wrongs ones, but I do not like the question.

    COUNT, as any other aggregate, ignores NULL values.

    COUNT(*) is special. The * means that it doesn't count values, but it count rows. A row can not be NULL (because a NULL is a scalar function, not a row). So for COUNT(*), it is hard to say whether NULLs are ignored or not, because there can be no NULLs in its input set.

    To say that COUNT does not ignore NULL values, as the question implies, is flat out wrong.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Koen Verbeeck (1/7/2015)


    DonlSimpson (1/7/2015)


    1 + 2 + 3 + NULL is NULL, absolutely. NULL is not replaceable by zero. Any set with a null value has a null sum, since null is "unknown" and therefore is free to range from the minimum to the maximum values of the domain. I have one full box of eggs, and another with an unknown number of eggs. How many eggs do I have?

    Nope.

    SELECT SUM(N)

    FROM

    (

    SELECT N = 1

    UNION ALL

    SELECT 2

    UNION ALL

    SELECT 3

    UNION ALL

    SELECT NULL

    ) tmp;

    I was not referring to the T-SQL implementation, but to the "logical" problem. The T-SQL SUM() function excludes nulls from its process for the convenience of developers, but the logical sum of the complete set is null.

    Don Simpson



    I'm not sure about Heisenberg.

  • Thank you for the post. Steve.

    Orig: "Which of these aggregate functions does not ignore NULL values?"

    Now: "Which of these aggregate functions does not exclude NULL values?"

    When I loaded the Qtod, GROUPING was listed and when I selected it and submitted it was changed to MAX. I was thinking COUNT and GROUPING, but as the option button allows only one choice (so, considering the fact of COUNT(*) and COUNT(cName) usage- kind of felt like a trick and then I chose GROUPING.

    Now I saw the complete conversation. 😎

    ww; Raghu
    --
    The first and the hardest SQL statement I have wrote- "select * from customers" - and I was happy and felt smart.

  • Hugo Kornelis (1/7/2015)


    Got the answer by deduction and removing the obvious wrongs ones, but I do not like the question.

    COUNT, as any other aggregate, ignores NULL values.

    COUNT(*) is special. The * means that it doesn't count values, but it count rows. A row can not be NULL (because a NULL is a scalar function, not a row). So for COUNT(*), it is hard to say whether NULLs are ignored or not, because there can be no NULLs in its input set.

    To say that COUNT does not ignore NULL values, as the question implies, is flat out wrong.

    Thanks Hugo, you saved me some typing, as I was about to complain about the "null rows" as well πŸ™‚

    Best Regards,

    Chris BΓΌttner

  • Hugo Kornelis (1/7/2015)


    Got the answer by deduction and removing the obvious wrongs ones, but I do not like the question.

    COUNT, as any other aggregate, ignores NULL values.

    COUNT(*) is special. The * means that it doesn't count values, but it count rows. A row can not be NULL (because a NULL is a scalar function, not a row). So for COUNT(*), it is hard to say whether NULLs are ignored or not, because there can be no NULLs in its input set.

    To say that COUNT does not ignore NULL values, as the question implies, is flat out wrong.

    Thanks for the nice explanation

  • When I use a Group By it includes a total of any NULL values as well as the field I am grouping by.

  • marcia.j.wilson (1/7/2015)


    Xavon (1/7/2015)


    Ed Wagner (1/7/2015)


    Koen Verbeeck (1/7/2015)


    Ed Wagner (1/7/2015)


    Koen Verbeeck (1/7/2015)


    Xavon (1/7/2015)


    I wonder how MS is defining 'ignore'; because I would argue that if SUM ignored NULLs, I would not have to use half as many ISNULL(<column>, 0) statements.

    I think 1 + 2 + 3 + NULL is the same as 1 + 2 + 3 + 0.

    But they aren't the same. The NULLs aren't included, so 1 + 2 + 3 + NULL would be the same as 1 + 2 + 3.

    The result is the same. Both times 6.

    My point was that if you replace NULL with 0 or not, it doesn't matter when using SUM.

    (and 1 + 2 + 3 is the same as 1 + 2 + 3 + 0 :-P)

    The point is correct, so the ISNULL(column, 0) isn't necessary unless MS plans to revise the SUM function to require it. πŸ˜‰

    The real lesson is don't comment before your coffee, or your brain might decide that SUM is the same as arithmetic addition...

    Can you give an example to show how SUM is different than arithmetic addition and also how using isnull(<column>,0) with SUM gives you different results than just ignoring the nulls?

    quick example of what I was thinking

    DECLARE @NullSum TABLE (ID INT, Val1 INT, Val2 INT)

    INSERT INTO @NullSum VALUES (1, 1, 2), (2, 3, 4), (3, 5, NULL)

    DECLARE @Loop INT, @Val1 INT, @Val2 INT

    SET @Val1 = 0

    SET @Val2 = 0

    SELECT @Loop = MIN(ID) FROM @NullSum

    WHILE @Loop <= (SELECT MAX(ID) FROM @NullSum)

    BEGIN

    SET @Val1 = @Val1 + (SELECT Val1 FROM @NullSum WHERE ID = @Loop)

    SET @Val2 = @Val2 + (SELECT Val2 FROM @NullSum WHERE ID = @Loop)

    SET @Loop = ISNULL((SELECT MIN(ID) FROM @NullSum WHERE ID > @Loop), @Loop + 2)

    END

    SELECT SUM(Val1) AS Val1Sum, @Val1 AS Val1Add, SUM(Val2) AS Val2Sum, @Val2 AS Val2Add FROM @NullSum

    SELECT Val1 + Val2 FROM @NullSum

    SELECT SUM(Val1), SUM(Val2) FROM @NullSum

    SELECT SUM(Val1) + SUM(Val2) FROM @NullSum

    SELECT SUM(Val1 + Val2) FROM @NullSum

    SELECT SUM(ISNULL(Val1, 0) + ISNULL(Val2, 0)) FROM @NullSum

  • Ditto

    The question asked if count would ignore values not ignore counting the row. I think the stated answer is wrong.

    Max, in effect, handles Nulls as a lower possible value. It will return Null if there is no non-Null value.

  • jbwa (1/8/2015)


    Ditto

    The question asked if count would ignore values not ignore counting the row. I think the stated answer is wrong.

    Max, in effect, handles Nulls as a lower possible value. It will return Null if there is no non-Null value.

    It will als return NULL if given an empty set:

    CREATE TABLE dbo.TestIt (a int);

    SELECT MAX(a) FROM dbo.TestIt;

    Nulls are ignored; both an empty set and a set with only null values result in an empty set being actually processed by MAX, hence the NULL return.

    SUM and AVG work the same, and I believe CHECKSUM_AGG does as well. COUNT will normally return 0 when the set fed into it has no data (either because the input set was empty, or because COUNT was asked to work on a nullable column and all input was ignored). I think I once saw a situation where COUNT could return NULL, but I do not remember the details, and I may be wrong.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • +1 thanks for posting this question.

    Andre Ranieri

Viewing 15 posts - 31 through 44 (of 44 total)

You must be logged in to reply to this topic. Login to reply