SELECT COUNT(*) or COUNT(1) Performance or Benefits?

Question

SELECT COUNT(*) or COUNT(1) Performance or Benefits?

Express12

SSCoach

Points: 18514
More actions
May 10, 2005 at 9:00 am

#68991

What are the advantages of using COUNT(*) -or- COUNT(1) in the following examples:
declare @Address1 int
select @Address1 = count(1) from dbo.Customers where ID < 0
select @Address1 as Address1count
declare @Address1 int
select @Address1 = count(*) from dbo.Customers where ID < 0
select @Address1 as Address1count

BT

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 1

* has the advantage that sql server will determine which is the best way to find the count(*). This will allow the server to do an index seek/scan without actually fetching the data on the hd. count(1|colname) doesn't always offer this possibility.

I know somebody else will have a more complete explanation of this process.

David.Poole SSC Guru Points: 75903 More actions · Answer 2

COUNT(*) or count(1) returns a count of all records including NULLS.

COUNT(expression) returns a count of all non-null records.

The first two don't seem to have any performance difference or differences in the execution plan they generate.

LinkedIn Profile

Dinakar Nethi-176633 SSCarpal Tunnel Points: 4689 More actions · Answer 3

I think Remi is right in that when you use COUNT(*) the optimizer chooses which index to use and returns the count faster as compared to COUNT(1) which need not necessarily be an indexed column.

******************
Dinakar Nethi
Life is short. Enjoy it.
******************

Farrell Keough SSCoach Points: 17414 More actions · Answer 4

Darn good question!!!

I wasn't born stupid - I had to study.

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 5

From books online on msdn :

count(*)

Specifies that all rows should be counted to return the total number of rows in a table. COUNT(*) takes no parameters and cannot be used with DISTINCT. COUNT(*) does not require an expression parameter because, by definition, it does not use information about any particular column. COUNT(*) returns the number of rows in a specified table without eliminating duplicates. It counts each row separately, including rows that contain null values.

And from a post on Oracle (ok it's not sql server but the same logic applies).

Count(*) returns the total row count

count(1) returns all rows where 1 is not null >> simplified version it returns all rows.

So Oracle is optimized to use count(*) instead of count(number) because count(1) wouldn't offer any performance gain. And I can only assume that sql server will use the same logic. I've also tried many different variation of count() but unless the specified column allows nulls the plan is always the same : index scan. With a non-indexed nullable column : clustered index scan.

You can run this and see for yourself :

Select count(autoval) from dbo.SysColumns --cis

Select count(*) from dbo.SysColumns --is

Select count(1) from dbo.SysColumns --is

Select count(id) from dbo.SysColumns --is

Chris Hedgate One Orange Chip Points: 25041 More actions · Answer 6

My first article at SQL Server Central discussed this: Advice on using COUNT( )

As David says, COUNT(*) or COUNT(1) (or COUNT(9917456) or whatever) will be treated exactly the same way by the optimizer. They both simply say "count the number of rows in the set", so the optimizer is free to do that exactly as it whishes. The fastest way to count the number of rows in a table is to do a scan on the most narrow non-clustered index of the table, so that is that plan that will be chosen if possible.

--
Chris Hedgate http://www.hedgate.net/
Contributor to the Best of SQL Server Central volumes
Articles: http://www.sqlservercentral.com/columnists/chedgate/

peterhe SSChampion Points: 11469 More actions · Answer 7

It's upto what you are going to achieve. If the first column is NOT NULL, COUNT(*) and COUNT(1) have the same result. If it's NULLable, the result may be different.

USE northwind

GO

SET NOCOUNT ON

SET ANSI_WARNINGS ON

select COUNT(*) FROM dbo.Customers

select COUNT(CustomerID) FROM dbo.Customers

select COUNT(Region) FROM dbo.Customers

select COUNT(FAX) FROM dbo.Customers

Result:

-----------

91

-----------

91

-----------

31

Warning: Null value is eliminated by an aggregate or other SET operation.

-----------

69

Warning: Null value is eliminated by an aggregate or other SET operation.

Their execution plans are slightly different. The last one uses clustered index scan and others use index scan on my machine.

select COUNT(*) FROM dbo.Customers

StmtText

---------------------------------------------------------------------------

|--Compute Scalar(DEFINE[Expr1002]=Convert([Expr1003])))

|--Stream Aggregate(DEFINE[Expr1003]=Count(*)))

|--Index Scan(OBJECT[Northwind].[dbo].[Customers].[Region]))

select COUNT(CustomerID) FROM dbo.Customers

StmtText

---------------------------------------------------------------------------

|--Compute Scalar(DEFINE[Expr1002]=Convert([Expr1003])))

|--Stream Aggregate(DEFINE[Expr1003]=Count(*)))

|--Index Scan(OBJECT[Northwind].[dbo].[Customers].[Region]))

select COUNT(Region) FROM dbo.Customers

StmtText

--------------------------------------------------------------------------------

|--Compute Scalar(DEFINE[Expr1002]=Convert([Expr1003])))

|--Stream Aggregate(DEFINE[Expr1003]=COUNT_BIG([Customers].[Region])))

|--Index Scan(OBJECT[Northwind].[dbo].[Customers].[Region]))

select COUNT(FAX) FROM dbo.Customers

StmtText

-------------------------------------------------------------------------------------------

|--Compute Scalar(DEFINE[Expr1002]=Convert([Expr1003])))

|--Stream Aggregate(DEFINE[Expr1003]=COUNT_BIG([Customers].[Fax])))

|--Clustered Index Scan(OBJECT[Northwind].[dbo].[Customers].[PK_Customers]))

Chris Hedgate One Orange Chip Points: 25041 More actions · Answer 8

>If the first column is NOT NULL, COUNT(*) and COUNT(1) have the same result. If it's NULLable, the result may be different.

Could you elaborate on that please?

--
Chris Hedgate http://www.hedgate.net/
Contributor to the Best of SQL Server Central volumes
Articles: http://www.sqlservercentral.com/columnists/chedgate/

peterhe SSChampion Points: 11469 More actions · Answer 9

Maybe I didn't make me clear. What I mean is that if the Nth column can have NULL value in the table, COUNT(*) and COUNT(N) may have different value.

In the example I given above,

the CustomerID is NOT NULL in the table; Fax and Region can be NULL in the table and some records REALLY have null values for the them. You can see the four COUNT() give different results.

The table is defined as:

CREATE TABLE [dbo].[Customers] (

[CustomerID] [nchar] (5) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,

[CompanyName] [nvarchar] (40) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,

[ContactName] [nvarchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,

[ContactTitle] [nvarchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,

[Address] [nvarchar] (60) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,

[City] [nvarchar] (15) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,

[Region] [nvarchar] (15) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,

[PostalCode] [nvarchar] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,

[Country] [nvarchar] (15) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,

[Phone] [nvarchar] (24) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,

[Fax] [nvarchar] (24) COLLATE SQL_Latin1_General_CP1_CI_AS NULL

) ON [PRIMARY]

GO

ALTER TABLE [dbo].[Customers] WITH NOCHECK ADD

CONSTRAINT [PK_Customers] PRIMARY KEY CLUSTERED

(

[CustomerID]

&nbsp ON [PRIMARY]

GO

CREATE INDEX [City] ON [dbo].[Customers]([City]) ON [PRIMARY]

GO

CREATE INDEX [CompanyName] ON [dbo].[Customers]([CompanyName]) ON [PRIMARY]

GO

CREATE INDEX [PostalCode] ON [dbo].[Customers]([PostalCode]) ON [PRIMARY]

GO

CREATE INDEX [Region] ON [dbo].[Customers]([Region]) ON [PRIMARY]

GO

Chris Hedgate One Orange Chip Points: 25041 More actions · Answer 10

Yes, you are right in that the queries return different results. The reasons were also mentioned earlier in this thread and in my article. As noted the reason is that COUNT(expression), where expression might simply be the name of a column, does not count occurences where expression evaluates to null. Since COUNT(*) counts every row, there is no concept of null involved (if there is a row then there is one, no matter how many of the columns on it that are null, or even might be) and therefore the results might differ. From the results we can see that there are 91 rows in the table Northwind.dbo.Customers, of which 60 (91-31) do not have a value (they are null) for column Region and 22 (91-69) do not have a value for column Fax.

You are also correct in that whether a column allows nulls or not might affect which execution plan is chosen. If a column that does not allow nulls is used as the expression, such as COUNT(CustomerId), this has the same meaning as COUNT(*) since all rows should be counted. Therefore the same plans will be used. If it does allow null then there might be rows with null and therefore the expression must be evaluated for each row (here evaluated means that the value of the column must be checked). If there is an index on the column then a scan of that index will be used (as in the COUNT(Region) example query), and if there is not an index then a table scan/clustered index scan will be used (as with the COUNT(Fax) example).

However, to finally get back to why I asked you to elaborate, I think you have misunderstood COUNT(1). The 1 does not indicate the ordinal number of a column, it is simply an expression that will always be true. So COUNT(*) and COUNT(1) will never return different results and will always use the same plan, since they are treated as identical queries by the optimizer.

--
Chris Hedgate http://www.hedgate.net/
Contributor to the Best of SQL Server Central volumes
Articles: http://www.sqlservercentral.com/columnists/chedgate/

tzara Say Hey Kid Points: 663 More actions · Answer 11

i started using count(1) when count(*) appeared not to return results in a timely manner (on very large table, lots of rows/columns). if truly implemented the same way by sql server, it must be serendipity i experienced a difference in performance.

that could be due to trying first with count(*), getting the table loaded in memory in the process, then moving on to using count(1). i think of '1' taking up less memory internally to process, but there's no real foundation for that.

Chris Hedgate One Orange Chip Points: 25041 More actions · Answer 12

Well, it might of course turn out that I went too far when I said that "COUNT(*) and COUNT(1) will never return different results and will always use the same plan, since they are treated as identical queries by the optimizer", since I have no way of knowing that for 100% certainty. (Oh, what I wouldn't do for the SQL Server source code)

That said, I would like more information from your example to be convinced that they were not treated the same in your specific situation. It is very probable that the reason is that the pages were fetched from disk in the first case and directly from cache in the second, just as you speculate.

>i think of '1' taking up less memory internally to process, but there's no real foundation for that.

In what way would it take up less memory than '*'? They are both nothing more than a character in a sql statement that is parsed by the optimizer into an execution plan. The execution plans for COUNT(*) and COUNT(1) have always been equal in the situations that I have encountered, but if they were not in your example (or any other situation) then of course there could be a difference in execution time.

--
Chris Hedgate http://www.hedgate.net/
Contributor to the Best of SQL Server Central volumes
Articles: http://www.sqlservercentral.com/columnists/chedgate/