Using SUM(1) to count Rows (vs Count(1))

  • I have run into an instance where a developer was using SUM(1) to count rows in a table rather than COUNT(1). Has anyone seen this before?

    I ran a comparison, and COUNT is far more efficient, but I wanted to know if anyone had seen this before.

    Thanks and happy Halloween!

  • Yeah, it's not horribly uncommon. An example of when I usually use it:

    SUM( CASE WHEN x=y THEN 1 ELSE 0 END) AS FilteredCount

    What kind of performance difference are you seeing? It shouldn't be that drastic if there's any kind of filter on the query. If it's a blind grab single table query, then index information would feed the count quicker instead of forcing it to go through all the rows.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • Somewhere I heard that COUNT(*) is optimized for this specific case (not Craig's conditional count).

    If you'd like a faster way that doesn't need to do a table scan, this technique also works:

    http://www.mssqltips.com/sqlservertip/1044/getting-a-sql-server-rowcount-without-doing-a-table-scan/


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Thanks, folks.

    Kraig, I was doing a straight rowcount of a 4 million row table, so that would explain the perf difference. The code I saw used SUM(1) for counts almost exclusively, but it was almost always associated with a where clause. I couldn't think of a reason that someone would rather use SUM instead of COUNT, so I thought a straight up side to side test would show me. Looks like that might not have been a fair test!

    What I saw for perf what that COUNT used about 50% less CPU time, and usually about 75% of duration as SUM did, but that was without a WHERE clause.

    I'm looking for anything that will help speed up the process that is coded, and I was wondering if the effort of changing SUM to COUNT was worth it. I'll do some more investigating.

  • This was removed by the editor as SPAM

  • Mister Sachmo (10/31/2012)


    Thanks, folks.

    Kraig, I was doing a straight rowcount of a 4 million row table, so that would explain the perf difference. The code I saw used SUM(1) for counts almost exclusively, but it was almost always associated with a where clause. I couldn't think of a reason that someone would rather use SUM instead of COUNT, so I thought a straight up side to side test would show me. Looks like that might not have been a fair test!

    What I saw for perf what that COUNT used about 50% less CPU time, and usually about 75% of duration as SUM did, but that was without a WHERE clause.

    I'm looking for anything that will help speed up the process that is coded, and I was wondering if the effort of changing SUM to COUNT was worth it. I'll do some more investigating.

    I see this post is a couple weeks old but, to follow up with Dwain's post to the SQLTips site, the code below doesn't use the deprecated views and if you turn it into a little scalar function, will work practically instantaneously (as I'm sure you would agree, even a SELECT COUNT(1) FROM xTable with millions of rows can take a long time) DECLARE @Table sysname = 'TableWith250millRows'

    SELECT

    c.row_count AS [RowCount]

    FROM

    sys.objects a

    JOIN

    sys.indexes b

    ON b.OBJECT_ID = a.OBJECT_ID

    JOIN

    sys.dm_db_partition_stats AS c

    ON b.OBJECT_ID = c.OBJECT_ID

    AND b.index_id = c.index_id

    WHERE

    b.index_id < 2

    AND a.is_ms_shipped = 0

    AND a.name = @Table

    ______________________________________________________________________________Never argue with an idiot; Theyll drag you down to their level and beat you with experience

  • Interesting...

    Didn't know those tables were deprecated.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • I was referring to sysindexes, it's eventually being replaced by sys.indexes. Or am I completely confused? :ermm:

    ______________________________________________________________________________Never argue with an idiot; Theyll drag you down to their level and beat you with experience

  • MyDoggieJessie (11/28/2012)


    I was referring to sysindexes, it's eventually being replaced by sys.indexes. Or am I completely confused? :ermm:

    Alas I am not the man to ask.

    I failed to mention earlier how much I like your signature. 😀


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • MyDoggieJessie (11/28/2012)


    I was referring to sysindexes, it's eventually being replaced by sys.indexes. Or am I completely confused? :ermm:

    sysindexes is indeed deprecated. It's already been replaced with sys.indexes (7 odd years ago), sysindexes is only included for backward compat with SQL 2000.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply