COUNT() Function

  • Hugo Kornelis (4/16/2013)


    Good question, mediocre explanation - no errors, but complicating things more than needed.

    There are three variations of COUNT:

    * COUNT(*) - returns the number of rows, regardless of content.

    * COUNT(expression) - evaluates expression for each row, and counts returns the number of non-NULL results. Both COUNT(columnname) and COUNT(1) are commonly used versions of this variation. The last one is too - it uses a complex expression that always returns NULL; COUNT returns the number of non-NULL results, which is 0.

    * COUNT(DISTINCT expression) - as the previous one, but only the number of distinct values is returned. So if the same value is returned multiple times, it will be counted as 1. Rarely used in practise. The version in this question that used a constant expression is legal syntax, but will never be used in practise, as it will by definition return 1, except if the source of the query has no rows.

    Thanks for the explanation Hugo.

    I have a small doubt on

    Both COUNT(columnname) and COUNT(1) are commonly used versions of this variation.

    COUNT((columnname) and COUNT(1) returns the number of non-NULL results ??

    create table #temp (SomeText varchar(20))

    insert into #temp

    select 'SomeText1'

    union all

    select NULL

    union all

    select 'SomeText2'

    union all

    select 'SomeText3'

    union all

    select 'SomeText3'

    select COUNT(*) from #temp

    select COUNT(1) from #temp

    select COUNT(SomeText) from #temp

    drop table #temp

    The above code returns 5, 5, 4. Based on the output it looks like COUNT(*) and COUNT(1) returns the number of rows, regardless of content.

    [font="Verdana"]Regards,
    Rals
    [/font].
  • Hugo Kornelis (4/16/2013)


    Good question, mediocre explanation - no errors, but complicating things more than needed.

    There are three variations of COUNT:

    * COUNT(*) - returns the number of rows, regardless of content.

    * COUNT(expression) - evaluates expression for each row, and counts returns the number of non-NULL results. Both COUNT(columnname) and COUNT(1) are commonly used versions of this variation. The last one is too - it uses a complex expression that always returns NULL; COUNT returns the number of non-NULL results, which is 0.

    * COUNT(DISTINCT expression) - as the previous one, but only the number of distinct values is returned. So if the same value is returned multiple times, it will be counted as 1. Rarely used in practise. The version in this question that used a constant expression is legal syntax, but will never be used in practise, as it will by definition return 1, except if the source of the query has no rows.

    Great question and great explanation Hugo. We can always COUNT() on you. :hehe:



    Everything is awesome!

  • kapil_kk (4/15/2013)


    For the last query:

    select COUNT(convert(int,NULL)) from #temp

    It will return 0 because when you execute the query:

    SELECT CONVERT(int,NULL) it will return NULL and as the return type of COUNT is INT so NULL will implicitly

    converted to '0'.

    Hope it will clear to you now Yogi πŸ™‚

    This isn't quite right Kapil. It is not an implicit conversion. What happens is you count NULL which will be zero. It doesn't actually matter what datatype you use, it just needs to have a datatype. It could just as easily be varchar or uniqueidentifier. The point is that you are casting NULL as a datatype and counting the number of NULL which we know will be ignored in a COUNT so the result is 0.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • rals (4/16/2013)


    (...)

    The above code returns 5, 5, 4. Based on the output it looks like COUNT(*) and COUNT(1) returns the number of rows, regardless of content.

    COUNT(*) returns the number of rows (after evaluating WHERE and GROUP BY if they are in the query) - by definition.

    COUNT(1) is a specific use of the generic form COUNT(expression). The expression here is a very simple one - the constant 1. The definition of COUNT(expression) is to evaluate expression for each row - that will result in the value 1 for each row - and then count the number of non-NULL results. Since 1 is never NULL, the result of COUNT(1) will indeed always be equal to COUNT(*), allthough the "official" way to get there is a lot more complicated.

    (And before you ask - the optimizer is smart enough to know that COUNT(1), just as COUNT(0) or COUNT('T'), or even COUNT(non-nullable column), are all equal to COUNT(*) and will use the same code path for all these versions).

    And for the sake of completeness - the use of COUNT(1) instead of COUNT(*) to count the number of rows is very common. Long ago (I think it was in the days of SQL Server 6.0, maybe even before that), the parser used to first expand the * in COUNT(*) to a full column list, which would then be still ignored. Using COUNT(1) was more efficient because it avoided the hit on the metadata tables to get that columnlist. That has long since changed - but the myth lives on, and many people still use COUNT(1), claiming it to be better then COUNT(*).


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Great explanation Hugo. Thanks.

    Even I always use COUNT(1) though few times I heard both COUNT(*) and COUNT(1) are same. But today I more clear and thanks for you.

    πŸ™‚

    [font="Verdana"]Regards,
    Rals
    [/font].
  • Thanks for a nice one, Yogesh!

  • Thanks for the question

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • I thought the original poster's explanation was great: I found it very simple to understand and I liked the way each count statement was addressed individually.

  • Rose Bud (4/16/2013)


    I thought the original poster's explanation was great: I found it very simple to understand and I liked the way each count statement was addressed individually.

    The problem with the original explanation is this:

    COUNT(convert(int,NULL)) - It will give count as 0 always. Don't know why it is giving 0 count however COUNT() function counts NULL values also.

    The problem is that convert(int,NULL) is not a row but a column value, and the count function - although it counts rows all of whose values are null - doesn't count null values when it's counting the number of non-null occurences of a column. There is a big difference between count(*) )or count(<non-null constant>) from <something> and count(X) from <something> - the first is counting rows and doesn't care about nulls - a row is a row even if it contains nothing but nulls, the second doesn't see any nulls anyway, while the third (which is what was in question here, in this particular case) is counting non-null values.

    Tom

  • Thanks for the great question!

    If you want retunr 5? here you go,

    select COUNT(@@ERROR) FROM #temp

    select COUNT(0) from #temp

    select COUNT(100000000000000000000000000000) from #temp

    select COUNT('ZZZZZZZZZZZZZZZZZZZZZZZZZZZZ') from #temp

    select COUNT(-5.0) from #temp

    select COUNT(+5.0) from #temp

    select COUNT('') from #temp

    select COUNT($10.0) from #temp

    select COUNT(\.00) from #temp

    --------------------------------------
    ;-)β€œEverything has beauty, but not everyone sees it.” ― Confucius

  • Hugo Kornelis (4/16/2013)


    And for the sake of completeness - the use of COUNT(1) instead of COUNT(*) to count the number of rows is very common. Long ago (I think it was in the days of SQL Server 6.0, maybe even before that), the parser used to first expand the * in COUNT(*) to a full column list, which would then be still ignored. Using COUNT(1) was more efficient because it avoided the hit on the metadata tables to get that columnlist. That has long since changed - but the myth lives on, and many people still use COUNT(1), claiming it to be better then COUNT(*).

    +1

    You always seem to turn even the simplest of these Qotds to something so insightful that it forces me to read the discussions on every qotd...

    You are awesome..!!!

    ___________________________________________________________________
    If I can answer a question then anyone can answer it..trying to reverse the logic.. :hehe:

  • Mike Hays (4/16/2013)


    Interesting, learned something, thanks for the good question....

    Same Here!

    Not all gray hairs are Dinosaurs!

  • Sean Lange (4/16/2013)


    kapil_kk (4/15/2013)


    For the last query:

    select COUNT(convert(int,NULL)) from #temp

    It will return 0 because when you execute the query:

    SELECT CONVERT(int,NULL) it will return NULL and as the return type of COUNT is INT so NULL will implicitly

    converted to '0'.

    Hope it will clear to you now Yogi πŸ™‚

    This isn't quite right Kapil. It is not an implicit conversion. What happens is you count NULL which will be zero. It doesn't actually matter what datatype you use, it just needs to have a datatype. It could just as easily be varchar or uniqueidentifier. The point is that you are casting NULL as a datatype and counting the number of NULL which we know will be ignored in a COUNT so the result is 0.

    Thanks for correcting me Sean

    _______________________________________________________________
    To get quick answer follow this link:
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

  • Hugo Kornelis (4/16/2013)


    rals (4/16/2013)


    (...)

    The above code returns 5, 5, 4. Based on the output it looks like COUNT(*) and COUNT(1) returns the number of rows, regardless of content.

    COUNT(*) returns the number of rows (after evaluating WHERE and GROUP BY if they are in the query) - by definition.

    COUNT(1) is a specific use of the generic form COUNT(expression). The expression here is a very simple one - the constant 1. The definition of COUNT(expression) is to evaluate expression for each row - that will result in the value 1 for each row - and then count the number of non-NULL results. Since 1 is never NULL, the result of COUNT(1) will indeed always be equal to COUNT(*), allthough the "official" way to get there is a lot more complicated.

    (And before you ask - the optimizer is smart enough to know that COUNT(1), just as COUNT(0) or COUNT('T'), or even COUNT(non-nullable column), are all equal to COUNT(*) and will use the same code path for all these versions).

    And for the sake of completeness - the use of COUNT(1) instead of COUNT(*) to count the number of rows is very common. Long ago (I think it was in the days of SQL Server 6.0, maybe even before that), the parser used to first expand the * in COUNT(*) to a full column list, which would then be still ignored. Using COUNT(1) was more efficient because it avoided the hit on the metadata tables to get that columnlist. That has long since changed - but the myth lives on, and many people still use COUNT(1), claiming it to be better then COUNT(*).

    I was not aware of this things..

    Thanks a lot Hugo for telling new things πŸ™‚

    _______________________________________________________________
    To get quick answer follow this link:
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

  • Thanks for the question. In general, I don't favor questions the require 57 answers to be evaluated. However, this one was OK as the code involved wasn't too complex.

    [font="Verdana"]Please don't go. The drones need you. They look up to you.[/font]
    Connect to me on LinkedIn

Viewing 15 posts - 16 through 30 (of 46 total)

You must be logged in to reply to this topic. Login to reply