Using a column name in a COUNT function

  • I think the count(1) myth started some 20 years ago when at some point there was an old version of Oracle that actually performed better with COUNT(1), but all modern servers are optimized for count(*) and if you enter count(1) the parser automatically translates it internally to count(*). Anyone interested can google for myth count(1) to see more info on the subject. As we have said before SQL will allow a programmer to code it anyway they want or like, but the universally accepted standard is count(*)

  • Sure the product has gotten better almost version by version. The COUNT function is still a function on a result set rather than a table function, right?

    IF I were looking for the number of rows in a table without regard to the contents of the rows I would not use COUNT at all. I would look in [sys].[sysindexes].

    ATBCharles Kincaid

  • Charles Kincaid (2/4/2015)


    IF I were looking for the number of rows in a table without regard to the contents of the rows I would not use COUNT at all. I would look in [sys].[sysindexes].

    That used to be an approximate in older versions of SQL Server.

    In the current versions it is "correct", for some definition of correct. The problem with this is that it does not respect transactional consistency. With a bit of effort, I can get [sys].[sysindexs] to report 1000 rows in a table that is constrained to never contain more than 100 rows.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • I think the count in sysindex is not guaranteed to be correct. So you can't depend on it 100%, but it is okay for an estimated number. COUNT(*) is more typically used to count the rows in a group. Like:

    SELCT CustomerName, SUM(OrderAmount), COUNT(*)

    FROM Order

    GROUP BY CustomerName

    In general when there are two or more options I try to look for some reasons to pick one over the other. I was thinking that perhaps it is easier to type COUNT(1), but it is actually one less keystroke to type COUNT(*). I am the worlds slowest typist but try yourself, which one can you type in faster? But I have not really seen any reasons why someone should use COUNT(1) instead of COUNT(*) other than preference. And preference is not a valid reason. Performance would be a good reason, but not so...

  • SQLMaister (2/4/2015)


    I am the worlds slowest typist ...

    I may have to challenge you to a race then. You can put your ten fingers against my 8. Speaking of 8 on my keyboard the asterisk is the shift of the 8. Oh, there is one that does not require the shift. That key way over there.

    You are right about sysindexes not being updated in the face of pending transactions and the like. Most of those times I was more concerned about getting a quick result rather than an absolute count. For my result sets I want the count to be spot on and can usually take any performance hit. That is because I'm asking for the count of what the server has already fetched.

    ATBCharles Kincaid

  • I just went and took a look at https://msdn.microsoft.com/en-us/library/ms190283.aspx Right off the bat there is:

    XML indexes are not supported in this view. Partitioned tables and indexes are not fully supported in this view; use the sys.indexes catalog view instead.

    But that view has no row count at all. 🙁

    ATBCharles Kincaid

  • count(*) requires 9 key strokes

    count(1) requires 10 key strokes

    ...and we don't want to wear out the shift key, also consider that the *() line up next to each other vs the 1 being separated.

    Human time is the most expensive component in this equation so count(*) is the winner.

    If someone insists we must put an integer between the parenthesis, then it would be more efficient to use 9 or 0, which is placed closer on the keyboard.

    count(9) or count(0) wouldn't be so bad... Actually it would be... just use count(*)

    After all these discussions, did anyone come up with a reasonable explanation why they ever wanted to use count(1) instead of count(*) ? Other than the myth that count(1) is faster?

  • Sean Lange (2/4/2015)


    andycadley (2/4/2015)


    dmcquade (2/3/2015)


    Is there a downside to using the IsNULL function? For example, using the COUNT script when I try:

    SELECT COUNT(IsNULL([Name], ''))

    I get 8 as the result

    It'll probably be slower than COUNT(*), unless the optimizer manages to spot what you're trying to do. Not worth trying it though, COUNT(*) will always be the fastest way because SQL Server has an explicit optimization to ensure so.

    It would be a little slower but count(*) is no faster than count(1) or count(any other nonNULL value). There is no internal optimization for count(*) versus counting other values. It just counts the rows.

    That being said, COUNT(IsNULL([Name], '') is rather silly. Why bother with the column here when you are going to end up counting rows. Just use count(*) instead.

    Sorry, I obviously wasn't clear. For COUNT(*) and COUNT(<constant>) the query optimizer knows exactly what you're trying to do and always chooses an optimal plan, so you'll never beat that. You might improve it with indexing and such on Views, but it'll still be the fastest way.

    When you start putting convoluted logic, like the IsNull example, you increase the risk that the query optimizer won't spot it and will go off and actually generate the data. So it's not only less readable but also more likely to give you issues. If you want to count rows just use COUNT(*) or COUNT(1) and be done with it.

  • Right you are. That asterisk is easier as you don't have to release the shift key. Also since the newer versions of SQL Server have this cool feature in the query optimizer that pretty much renders this more a matter of style then.

    "A difference that makes no difference is no difference."

    ATBCharles Kincaid

  • Totally is a downside to using count(func(column)) if column doesn't doesn't have an index. And, isnull() doesn't do any magic there, just replacing the nulls that wouldn't get counted, for a null able column. Otherwise you could use select count(null), instead of select count(1). But, excuse pun, nulls don't count. Although, they do OK in where exist(select null from ..), but then again so does 1/0 in that particular case.

    The thing with count(*) is that it has * in it, as does select *. So if you search for the latter by finding * in your code - you get a load of false positives. But, if your coding standard specifies count(1), you should only go "Oh crap, there's a select * in here somewhere" - when there actually is one.

    This DBA says - "It depends".

  • There is nothing wrong with having *'s in your code. That is the most perfect SQL standard code as intended by the scientists who developed SQL. For example COUNT(*) and WHERE EXISTS (SELECT * FROM XYZ where XYZ.ID = 123456) are the most correct way to code. Thus there is nothing wrong with including a * in your code. Further more the standard comments are /* comment here */ so you have *'s there. However you should NEVER code SELECT * FROM... in a result set (or as part of an insert statement or similar), that is an extremely bad coding habit. But never put COUNT(1) or (SELECT 1 in the examples above as some people suggest, it is non standard SQL and looks unprofessional. It works, I know, but just because it works doesn't make it correct and elegant...

  • SQLMaister (2/5/2015)


    There is nothing wrong with having *'s in your code. That is the most perfect SQL standard code as intended by the scientists who developed SQL. For example COUNT(*) and WHERE EXISTS (SELECT * FROM XYZ where XYZ.ID = 123456) are the most correct way to code. Thus there is nothing wrong with including a * in your code. Further more the standard comments are /* comment here */ so you have *'s there. However you should NEVER code SELECT * FROM... in a result set (or as part of an insert statement or similar), that is an extremely bad coding habit. But never put COUNT(1) or (SELECT 1 in the examples above as some people suggest, it is non standard SQL and looks unprofessional. It works, I know, but just because it works doesn't make it correct and elegant...

    +10000

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Not in itself a very elegant argument. But, here goes - "Elegant - schmelegant". Or to re-interpret my favourite quote, "Everything as elegant as possible, but no more elegant than necessary".

    Of side-on relevance. Your code may be elegant, but how long have you spent deep down amongst your data - because believe me data is never that elegant.

    No doubt your from the "Beauty is truth, blah, blah .." camp. And, you always end up with code that has no bugs. Yeah right.

    This DBA says - "It depends".

  • Nothing is perfect, but that doesn't mean we shouldn't try to do our best. It doesn't make sense to me that a developer would say, "I don't want to follow standards, because I like to code it my own way." I still have not seen any argument, why NOT to use COUNT(*), or why to use COUNT(1). It seems like there should at least be some rational reasoning behind the decision. It sounds like a major case of "I have always done it that way"... It happens all the time that we code a certain way, but when we do find better ways, we shouldn't be afraid to change.

  • SQLMaister (2/6/2015)


    Nothing is perfect, but that doesn't mean we shouldn't try to do our best. It doesn't make sense to me that a developer would say, "I don't want to follow standards, because I like to code it my own way." I still have not seen any argument, why NOT to use COUNT(*), or why to use COUNT(1). It seems like there should at least be some rational reasoning behind the decision. It sounds like a major case of "I have always done it that way"... It happens all the time that we code a certain way, but when we do find better ways, we shouldn't be afraid to change.

    I think the argument here isn't "why NOT to use COUNT(*)" or "why to use COUNT(1)". It's more along the lines of "use either, the result and the query execution will be the same."

    EDIT: I am referring specifically to these two versions and not to the use of a column name inside the COUNT().

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

Viewing 15 posts - 91 through 105 (of 113 total)

You must be logged in to reply to this topic. Login to reply