Using a column name in a COUNT function

  • gavansmith (10/15/2013)


    can through -> can throw ?

    '

    Yep. That was pointed out this morning. I'll fix it.

    ATBCharles Kincaid

  • Antares686 (10/15/2013)


    Charles Kincaid (10/15/2013)


    What I would actually like to see is COUNT() to mean to count the number of rows.

    I would like to only have to type COUNT() instead of COUNT(*) or COUNT(1) or anything to get a generic count. Sure it's 1 character, but why make us type something so trivial that could be handled without that.

    If that was the case how would we count distinct values or count nonNULL values?

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • SQLMaister (10/15/2013)


    There is no benefit in using select count(1), Select Count(columnname) excludes NULL values, that is the intended purpose. Select count(*) is the most efficient and correct way.

    Pretty broad brush strokes you are painting with. We have a long history of backing up these types of comments with proof.

    It is incorrect to say there is no benefit is using select count(1). I don't use that method but there is nothing wrong with it at all.

    Stating that count(*) is the most efficient with nothing to back up those claims doesn't hold much water. Put together a million row test table and demonstrate that it is faster along with those claims and then you have something.

    Saying that using count(*) is the correct way is just plain preference. You prefer that to count(1) so for you it is "correct". That is not true for everybody.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • COUNT(*) is absolutely the right way and COUNT(1) doesn't mean or do anything different other than cause confusion. If it was somehow better and yet returned the same results, do you really think that MS would not implement COUNT(*) in terms of COUNT(1) (or vice-versa).

    With regards to proof, show me a different query plan for each and I'll believe you. I have never, ever seen a different plan on the odd occasion I have had this argument before.

    To all those complaining about table scans and such, bear in mind that the optimiser will decide how best to implement the function and that will take into account any indexes that it can use as well. The only reason to to a table scan to determine a count is where there is no index on a non-nullable column (i.e. no PK). This is a problem with your database design and there is no way (even by hand) you could determine the count without scanning each row. SQL is declarative - ask for what you want in the standard language and let the database work it out. Generally speaking when you try to second guess what goes on under the hood with things like this you are not helping!

  • LoztInSpace (10/15/2013)


    COUNT(*) is absolutely the right way and COUNT(1) doesn't mean or do anything different other than cause confusion. If it was somehow better and yet returned the same results, do you really think that MS would not implement COUNT(*) in terms of COUNT(1) (or vice-versa).

    With regards to proof, show me a different query plan for each and I'll believe you. I have never, ever seen a different plan on the odd occasion I have had this argument before.

    To all those complaining about table scans and such, bear in mind that the optimiser will decide how best to implement the function and that will take into account any indexes that it can use as well. The only reason to to a table scan to determine a count is where there is no index on a non-nullable column (i.e. no PK). This is a problem with your database design and there is no way (even by hand) you could determine the count without scanning each row. SQL is declarative - ask for what you want in the standard language and let the database work it out. Generally speaking when you try to second guess what goes on under the hood with things like this you are not helping!

    Stating that one of two methods that does the same thing is the "correct way" is purely arrogance. I use count(*) but that is because I prefer that method. I don't however say that the other way is wrong. It causes you confusion to use select(1), that does NOT mean it causes confusion to everyone. There are people who are confused by count(*).

    We are absolutely in agreement from a performance standpoint but I don't have such a stance that my way is the best way and the other ways are wrong.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • There is no difference between COUNT(*) and COUNT(1). You will get the exact same execution plan with either.

  • Sean Lange (10/15/2013)


    Antares686 (10/15/2013)


    Charles Kincaid (10/15/2013)


    What I would actually like to see is COUNT() to mean to count the number of rows.

    I would like to only have to type COUNT() instead of COUNT(*) or COUNT(1) or anything to get a generic count. Sure it's 1 character, but why make us type something so trivial that could be handled without that.

    If that was the case how would we count distinct values or count nonNULL values?

    The idea to get the count of all records requiring any input is completely unnecessary. They could alter COUNT to accept 0 inputs to do the same thing as COUNT(*) or COUNT(1) and then if you want a count of distinct records on a column or non-null you put in a column name, so COUNT(COLUMN1).

  • Count(1) and count(*) are technically the same. They can be different as 1 is not a column and SQL can thus leave any value.

    Execution plan, reads everything will be the same.

    Amit Pandey

    SQL Server DBA

  • Charles Kincaid (10/15/2013)


    Brandie Tarvin (10/15/2013)


    Interesting results on the execution plan. All plans but the COUNT() on Name came backup with only an expected 9B in the row size. The Name count came back much bigger (and off of what the results were).

    That is part of the point of the article. COUNT([Column]) if the column is fixed width has to pull in the contents of the column. Experiment with columns of various widths(INT vs. BIGINT).

    Except your article really doesn't seem to mention anything about the execution plans or what is pulled in. Plus NVARCHAR(MAX) is not a fixed width column. So I'm not sure what you mean by that last comment.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • SQLMaister (10/15/2013)


    Select count(*) is the most efficient and correct way.

    Careful there. Absolute statements like this can get you into trouble in the DBA world.

    The "correct way" depends on what you are trying to get back from the results and varies from one environment to another.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • SQLMaister (10/15/2013)


    There is no benefit in using select count(1), Select Count(columnname) excludes NULL values, that is the intended purpose. Select count(*) is the most efficient and correct way.

    Welcome to the forum - I see that that was your first post. I agree that there's no benefit in using COUNT(1) over COUNT(*) as they're the same. Conversely, there's no benefit in using COUNT(*) over COUNT(1).

    My preference is to use COUNT(1). Similarly I'll use IF EXISTS(SELECT 1....etc.). If I work somewhere where all the code says COUNT(*) then I'll do that instead - it's not really a big deal.

  • Charles Kincaid (10/15/2013)


    Almost to the same level of performance boost that can be obtained by moving some predicated to the JOIN instead of leaving them in the WHERE (but that another subject).

    Could you expand on that, please?

  • marlon.seton (10/16/2013)


    Charles Kincaid (10/15/2013)


    Almost to the same level of performance boost that can be obtained by moving some predicated to the JOIN instead of leaving them in the WHERE (but that another subject).

    Could you expand on that, please?

    Yes, I could. However that is far from the subject of the article. I don't want to fork this discussion so give me a bit and I'll see what I can send you. Work calls.

    ATBCharles Kincaid

  • Muchas gracias.

  • From the article:

    You know that doing COUNT(*) on a table with a lot of columns and a lot of rows can take a lot of time and memory.

    Actually, no. I don't know that. Because it's not true.

    In both SELECT COUNT(*) ... and in EXISTS (SELECT * ...), the optiimizer knows that * represents rows, not individual columns. It will not expand a column list, it will not return all columns, and it will not fetch all columns. It will find the cheapest way to satisfy the query, which usually involves using the smallest available nonclustered index.

    Charles Kincaid (10/15/2013)


    With newer versions of SQL Server (2008 and 2008r2) the good folks on the development team seem to have done something fun with COUNT(*). It tends to short-circuit and do the same as COUNT(1).

    That goes back a bit further than SQL 2008 / R2. COUNT(*) was just as efficient as COUNT(1) in SQL Server 7.0, maybe even in 6.5. Not sure on 4.2 or older versions.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

Viewing 15 posts - 31 through 45 (of 113 total)

You must be logged in to reply to this topic. Login to reply