Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 12»»

Using SUM(1) to count Rows (vs Count(1)) Expand / Collapse
Author
Message
Posted Wednesday, October 31, 2012 2:44 PM
Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: Friday, May 24, 2013 9:35 AM
Points: 56, Visits: 129
I have run into an instance where a developer was using SUM(1) to count rows in a table rather than COUNT(1). Has anyone seen this before?
I ran a comparison, and COUNT is far more efficient, but I wanted to know if anyone had seen this before.

Thanks and happy Halloween!
Post #1379533
Posted Wednesday, October 31, 2012 2:51 PM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Today @ 5:35 PM
Points: 5,722, Visits: 6,194
Yeah, it's not horribly uncommon. An example of when I usually use it:

SUM( CASE WHEN x=y THEN 1 ELSE 0 END) AS FilteredCount

What kind of performance difference are you seeing? It shouldn't be that drastic if there's any kind of filter on the query. If it's a blind grab single table query, then index information would feed the count quicker instead of forcing it to go through all the rows.



- Craig Farrell

Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

For better assistance in answering your questions | Forum Netiquette
For index/tuning help, follow these directions. |Tally Tables

Twitter: @AnyWayDBA
Post #1379536
Posted Wednesday, October 31, 2012 7:17 PM


SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Today @ 5:02 PM
Points: 2,369, Visits: 3,247
Somewhere I heard that COUNT(*) is optimized for this specific case (not Craig's conditional count).

If you'd like a faster way that doesn't need to do a table scan, this technique also works:
http://www.mssqltips.com/sqlservertip/1044/getting-a-sql-server-rowcount-without-doing-a-table-scan/



No loops! No CURSORs! No RBAR! Hoo-uh!

INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Are you too recursively challenged?
Splitting strings based on patterns can be fast!
Post #1379578
Posted Wednesday, October 31, 2012 8:03 PM
Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: Friday, May 24, 2013 9:35 AM
Points: 56, Visits: 129
Thanks, folks.

Kraig, I was doing a straight rowcount of a 4 million row table, so that would explain the perf difference. The code I saw used SUM(1) for counts almost exclusively, but it was almost always associated with a where clause. I couldn't think of a reason that someone would rather use SUM instead of COUNT, so I thought a straight up side to side test would show me. Looks like that might not have been a fair test!

What I saw for perf what that COUNT used about 50% less CPU time, and usually about 75% of duration as SUM did, but that was without a WHERE clause.

I'm looking for anything that will help speed up the process that is coded, and I was wondering if the effort of changing SUM to COUNT was worth it. I'll do some more investigating.

Post #1379585
Posted Thursday, November 01, 2012 10:33 AM


SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Tuesday, January 15, 2013 11:11 AM
Points: 1,945, Visits: 2,782
I have run into an instance where a developer was using SUM(1) to count rows in a table rather than COUNT(1). Has anyone seen this before?


I never ran into that one! And I see a lot of bad SQL. But the right idiom is COUNT(*) if you want the cardinality of a table. Most SQLs will see the * and go to schema information tables of some kind directly. The "COUNT(<constant>)" idiom is a very old Oracle trick that now gets optimized back to COUNT(*); it used to actually generate rows internally!

I will guess that SUM(<expression>) does not have an optimization for 1 yet. Ugh! But who would have thought you might need it?


Books in Celko Series for Morgan-Kaufmann Publishing
Analytics and OLAP in SQL
Data and Databases: Concepts in Practice
Data, Measurements and Standards in SQL
SQL for Smarties
SQL Programming Style
SQL Puzzles and Answers
Thinking in Sets
Trees and Hierarchies in SQL
Post #1379926
Posted Wednesday, November 07, 2012 7:04 AM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Thursday, June 13, 2013 9:46 AM
Points: 3,157, Visits: 4,341
The one time I saw this was assisting a client optimising their SSRS reports.
One of these reports had a sub report, in the lines of:
select Productname,
--(SELECT COUNT(1) FROM Product.Sales s WHERE s.ProductKey = Product.Productkey AND QuantityOrdered > 10000 ) AS MoreThan10000,
SUM(CASE WHEN QuantityOrdered > 10000 THEN 1 ELSE 0 END) AS MoreThan10000,
--(SELECT COUNT(1) FROM Product.Sales s WHERE s.ProductKey = Product.Productkey AND QuantityOrdered <= 10000 AND QuantityOrdered > 5000) AS Between5001And10000,
SUM(CASE WHEN QuantityOrdered <= 10000 AND QuantityOrdered > 5000 THEN 1 ELSE 0 END) AS Between5001And10000,
--(SELECT COUNT(1) FROM Product.Sales s WHERE s.ProductKey = Product.Productkey AND QuatityOrdered <=5000 AND QuantityOrdered > 0 ) AS Between1And5000,
SUM(CASE WHEN QuatityOrdered <=5000 AND QuantityOrdered > 0 THEN 1 ELSE 0 END) AS Between1And5000,
--(SELECT COUNT(1) FROM Product.Product p left join Product.Sales s On s.ProductKey = p.Productkey where p.ProductKey = Product.Productkey AND s.SalesKey IS NULL) AS NeverOrdered,
SUM(CASE WHEN Sales.SalesKey IS NULL THEN 1 ELSE 0 END) AS NeverOrdered
from Product.Product
left join Product.Sales On Sales.ProductKey = Product.Productkey
where .....

This actually performed favourably compared to the commented out subselects containing COUNT(1) with the relevant WHERE clause


____________________________________________
Space, the final frontier? not any more...
All limits henceforth are self-imposed.
“libera tute vulgaris ex”
Post #1381965
Posted Wednesday, November 28, 2012 8:09 PM


SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Today @ 4:32 PM
Points: 2,063, Visits: 3,787
Mister Sachmo (10/31/2012)
Thanks, folks.

Kraig, I was doing a straight rowcount of a 4 million row table, so that would explain the perf difference. The code I saw used SUM(1) for counts almost exclusively, but it was almost always associated with a where clause. I couldn't think of a reason that someone would rather use SUM instead of COUNT, so I thought a straight up side to side test would show me. Looks like that might not have been a fair test!

What I saw for perf what that COUNT used about 50% less CPU time, and usually about 75% of duration as SUM did, but that was without a WHERE clause.

I'm looking for anything that will help speed up the process that is coded, and I was wondering if the effort of changing SUM to COUNT was worth it. I'll do some more investigating.
I see this post is a couple weeks old but, to follow up with Dwain's post to the SQLTips site, the code below doesn't use the deprecated views and if you turn it into a little scalar function, will work practically instantaneously (as I'm sure you would agree, even a SELECT COUNT(1) FROM xTable with millions of rows can take a long time)
DECLARE @Table sysname = 'TableWith250millRows'
SELECT
c.row_count AS [RowCount]
FROM
sys.objects a
JOIN
sys.indexes b
ON b.OBJECT_ID = a.OBJECT_ID
JOIN
sys.dm_db_partition_stats AS c
ON b.OBJECT_ID = c.OBJECT_ID
AND b.index_id = c.index_id
WHERE
b.index_id < 2
AND a.is_ms_shipped = 0
AND a.name = @Table



______________________________________________________________________________
"Never argue with an idiot; They'll drag you down to their level and beat you with experience"
Post #1390236
Posted Wednesday, November 28, 2012 8:28 PM


SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Today @ 5:02 PM
Points: 2,369, Visits: 3,247
Interesting...

Didn't know those tables were deprecated.



No loops! No CURSORs! No RBAR! Hoo-uh!

INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Are you too recursively challenged?
Splitting strings based on patterns can be fast!
Post #1390238
Posted Wednesday, November 28, 2012 8:48 PM


SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Today @ 4:32 PM
Points: 2,063, Visits: 3,787
I was referring to sysindexes, it's eventually being replaced by sys.indexes. Or am I completely confused?

______________________________________________________________________________
"Never argue with an idiot; They'll drag you down to their level and beat you with experience"
Post #1390259
Posted Wednesday, November 28, 2012 8:56 PM


SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Today @ 5:02 PM
Points: 2,369, Visits: 3,247
MyDoggieJessie (11/28/2012)
I was referring to sysindexes, it's eventually being replaced by sys.indexes. Or am I completely confused?


Alas I am not the man to ask.

I failed to mention earlier how much I like your signature.



No loops! No CURSORs! No RBAR! Hoo-uh!

INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Are you too recursively challenged?
Splitting strings based on patterns can be fast!
Post #1390260
« Prev Topic | Next Topic »

Add to briefcase 12»»

Permissions Expand / Collapse