Basit's SQL Server Tips

Basit Farooq is a Lead Database Administrator, Trainer and Technical Author. He has more than a decade of IT experience in development, technical training and database administration on Microsoft SQL Server platforms. Basit has authored numerous SQL Server technical articles, and developed and implemented many successful database infrastructure, data warehouse and business intelligence projects. He holds a master's degree in computer science from London Metropolitan University, and industry standard certifications from Microsoft, Sun, Cisco, Brainbench, Prosoft and APM, including MCITP Database Administrator 2008, MCITP Database Administrator 2005, MCDBA SQL Server 2000 and MCTS .NET Framework 2.0 Web Applications.

Different approaches to counting the number of rows in a table

Today, I received an email from a developer asking whether there is a better way than the COUNT(*) Transact-SQL statement to count the number of records in a table. My reply was yes: there are several methods to get this information from SQL Server. However, none of these approaches is perfect, and each has its own disadvantages. In this blog post, I will show different methods of counting the number of rows in a table.

Before discussing the different approaches, I must emphasize that the COUNT(*) statement gives you the true count of the total number of rows in a table. To get the exact count, COUNT(*) performs a full table scan on a heap and a clustered index scan on a table with a clustered index (the optimizer may instead scan a narrower nonclustered index when one exists). Because of this, it gets slower as the table gets bigger, as it is effectively counting each row separately, including rows that contain NULL values. For more information about COUNT(*), see the MSDN documentation.
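As a baseline for the approaches that follow, here is a minimal sketch of the exact count. It assumes the AdventureWorks2012 database and uses the Person.Address table, which appears later in this post; any other table name would work the same way.

```sql
-- Ensure a USE [AdventureWorks2012] statement has been executed first.

-- Exact row count; every row is counted, including rows with NULL values.
SELECT COUNT(*) AS [row_count]
FROM [Person].[Address];

-- COUNT_BIG(*) returns bigint instead of int, which matters for tables
-- that may hold more than 2,147,483,647 rows.
SELECT COUNT_BIG(*) AS [row_count]
FROM [Person].[Address];
```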

Let’s take a look at the different approaches to counting the number of records in a table:

Approach 1: Counting rows using sys.partitions catalog view

We can use the sys.partitions catalog view to check the structure of a table. This catalog view returns one row for each partition of all tables and most types of indexes in the database (Full-Text, Spatial, and XML indexes are not included). We can join the sys.partitions catalog view with the sys.tables catalog view to quickly get the row count for all tables in the database. Here is the sample sys.partitions query, which I use to get the row count for all tables in the database:

SET NOCOUNT ON;
SET STATISTICS IO ON;

-- Ensure a USE [databasename] statement has been executed first.
SELECT SCHEMA_NAME(t.[schema_id]) AS [table_schema]
      ,OBJECT_NAME(p.[object_id]) AS [table_name]
      ,SUM(p.[rows]) AS [row_count]
FROM [sys].[partitions] p
INNER JOIN [sys].[tables] t ON p.[object_id] = t.[object_id]
WHERE p.[index_id] < 2
GROUP BY p.[object_id]
	,t.[schema_id]
ORDER BY 1, 2 ASC
OPTION (RECOMPILE);

Here is the output when I run it against the AdventureWorks2012 sample database:

[Screenshot 01]

Although this is one of the fastest ways to count the number of rows in a table, the sys.partitions documentation states that the count is not always accurate. I could not find anything on MSDN that explains the cases in which the count will be inaccurate. The only situation in which I found the count to be inaccurate was when I ran this query while DML operations were in progress against the table being counted.
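One way to check the accuracy for yourself is to put the metadata count next to the exact count for a single table. The following is only a sketch, assuming the AdventureWorks2012 database and the Person.Address table; on an idle table the two columns should match, and while DML is in progress they may differ.

```sql
-- Ensure a USE [AdventureWorks2012] statement has been executed first.
SELECT  (SELECT SUM(p.[rows])
         FROM [sys].[partitions] p
         WHERE p.[object_id] = OBJECT_ID(N'Person.Address')
           AND p.[index_id] < 2   -- 0 = heap, 1 = clustered index
        ) AS [metadata_row_count]
       ,(SELECT COUNT_BIG(*)
         FROM [Person].[Address]
        ) AS [exact_row_count];
```

The index_id < 2 filter matters because a table has either a heap (index_id 0) or a clustered index (index_id 1), never both. Without the filter, the rows in each nonclustered index would be added to the total as well.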

Approach 2: Counting table rows using sys.dm_db_partition_stats dynamic management view

As with sys.partitions, we can use the sys.dm_db_partition_stats dynamic management view to count the number of rows in a table. This dynamic management view contains row-count information for every partition and shows the space used to store and manage the different data allocation unit types. According to MSDN, the row_count column of sys.dm_db_partition_stats is an approximate value, and Microsoft does not document the cases in which the count is inaccurate. However, as with the sys.partitions catalog view, the only situation in which I found the count to be inaccurate was when I queried it while DML operations were in progress against the table being counted. Here is my query, which combines the sys.dm_db_partition_stats dynamic management view and the sys.tables catalog view to get the row count for all tables in the database:

SET NOCOUNT ON;
SET STATISTICS IO ON;

-- Ensure a USE [databasename] statement has been executed first.
SELECT SCHEMA_NAME(t.[schema_id]) AS [table_schema]
      ,t.[name] AS [table_name]
      ,SUM(ps.[row_count]) AS [row_count]
FROM [sys].[tables] t
INNER JOIN [sys].[dm_db_partition_stats] ps
     ON ps.[object_id] = t.[object_id]
WHERE ps.[index_id] < 2
GROUP BY t.[name]
	,t.[schema_id]
ORDER BY 1, 2 ASC
OPTION (RECOMPILE);

Here is the output when I run it against the AdventureWorks2012 sample database:

[Screenshot 02]

When I compared the STATISTICS IO output of both queries, I noticed that the sys.dm_db_partition_stats version is slightly faster than the sys.partitions version. The STATISTICS IO output below shows that SQL Server performs fewer logical reads, and so does less work, to retrieve the row count information through the sys.dm_db_partition_stats dynamic management view.

STATISTICS IO output for sys.partitions version of query:

Table 'syssingleobjrefs'. Scan count 75, logical reads 151, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'sysidxstats'. Scan count 77, logical reads 175, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'sysschobjs'. Scan count 0, logical reads 768, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'sysrowsets'. Scan count 1, logical reads 8, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

STATISTICS IO output for sys.dm_db_partition_stats version of query:

Table 'syssingleobjrefs'. Scan count 75, logical reads 151, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'sysidxstats'. Scan count 76, logical reads 164, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'sysschobjs'. Scan count 0, logical reads 384, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
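If you only need the count for one table, the join to sys.tables is not required. Here is a minimal sketch that looks up a single table by name, again assuming the AdventureWorks2012 database and the Person.Address table. Note that querying this dynamic management view requires the VIEW DATABASE STATE permission.

```sql
-- Ensure a USE [AdventureWorks2012] statement has been executed first.
SELECT SUM(ps.[row_count]) AS [row_count]
FROM [sys].[dm_db_partition_stats] ps
WHERE ps.[object_id] = OBJECT_ID(N'Person.Address')
  AND ps.[index_id] IN (0, 1);   -- 0 = heap, 1 = clustered index
```

SUM is still needed here because a partitioned table returns one row per partition.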

Approach 3: Counting table rows using sp_spaceused system stored procedure

We can use the sp_spaceused system stored procedure to return space usage information about a database or a table within a database. The syntax for this procedure is:

sp_spaceused [[@objname=] 'objname']
[,[@updateusage=] 'TRUE|FALSE']

You use the @objname parameter to enter the name of a table, indexed view, or Service Broker queue to retrieve space usage information specific to that object. If you set @updateusage to TRUE, DBCC UPDATEUSAGE is run to update the usage information. The default value for this parameter is FALSE. You can run the command without any parameters to retrieve information about the current database:

EXEC [sp_spaceused]

This returns the following information:

  • Current database name
  • Current database size
  • Unallocated space
  • Reserved space
  • Space used by data
  • Space used by indexes
  • Unused space

To view information for a table in the current database, you must include the table name. For example, to look at information for the Person.Address table in the AdventureWorks2012 database, you would run:

EXEC [sp_spaceused] 'Person.Address'

This would return:

  • Table name
  • Number of rows in the table
  • Reserved space
  • Space used by data
  • Space used by indexes
  • Unused space

Here is the sample output in SSMS:

[Screenshot 03]

It is not necessary to include the parameter names (@parametername =) when running this command.
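The following sketch shows the positional and named forms side by side, using the documented parameter names @objname and @updateusage. It assumes the AdventureWorks2012 database and the Person.Address table.

```sql
-- Ensure a USE [AdventureWorks2012] statement has been executed first.

-- Positional form: the first argument is the object name.
EXEC [sp_spaceused] N'Person.Address';

-- Named form: equivalent to the call above.
EXEC [sp_spaceused] @objname = N'Person.Address';

-- Run DBCC UPDATEUSAGE for the table first, so that the reported row
-- count and space figures are corrected before they are returned.
EXEC [sp_spaceused] @objname = N'Person.Address', @updateusage = 'TRUE';
```

Setting @updateusage to TRUE can take noticeably longer on large tables, because the usage information has to be recalculated before the results are returned.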

Finally, here is my script that uses the sp_spaceused system stored procedure to return the table usage information for all tables in the database, including the row count of each table:

SET NOCOUNT ON;

-- Ensure a USE [databasename] statement has been executed first.
DECLARE @Database		            [nvarchar] (256)
       ,@TSQLCommand01              [nvarchar] (MAX)

SET @Database = DB_NAME()

IF OBJECT_ID(N'TempDb.dbo.#Table_Size_Info') IS NOT NULL
	DROP TABLE #Table_Size_Info

CREATE TABLE #Table_Size_Info (
	 [ID] [int] IDENTITY(1, 1) PRIMARY KEY
	,[ObjectName] [sysname]
	,[NumRows] [bigint]
	,[Reserved] [varchar](30)
	,[Data] [varchar](30)
	,[IndexSize] [varchar](30)
	,[Unused] [varchar](30)
	,[ObjectType] [char](1)
	)

SET @TSQLCommand01 = N''
SET @TSQLCommand01 = N'USE' + SPACE(1) + QUOTENAME(@Database) + N';' + SPACE(1) + CHAR(13)
SET @TSQLCommand01 = @TSQLCommand01 + N'DECLARE @SQLStatementID02 [smallint] ,' + SPACE(1) + CHAR(13)
SET @TSQLCommand01 = @TSQLCommand01 + N'@CurrentObjectSchema         [sysname] ,' + SPACE(1) + CHAR(13)
SET @TSQLCommand01 = @TSQLCommand01 + N'@CurrentObjectName           [sysname] ,' + SPACE(1) + CHAR(13)
SET @TSQLCommand01 = @TSQLCommand01 + N'@CurrentObjectFullName [sysname] ,' + SPACE(1) + CHAR(13)
SET @TSQLCommand01 = @TSQLCommand01 + N'@CurrentObjectType           [char](1)' + SPACE(1) + CHAR(13)
SET @TSQLCommand01 = @TSQLCommand01 + N'DECLARE @AllObjects TABLE ( [ID] [int] IDENTITY(1, 1) PRIMARY KEY , [ObjectSchema] [sysname] , [ObjectName] [sysname] , [ObjectType] [char](1) , [Completed] [bit] );' + SPACE(1) + CHAR(13)
SET @TSQLCommand01 = @TSQLCommand01 + N'INSERT INTO @AllObjects ([ObjectSchema], [ObjectName], [ObjectType], [Completed])' + CHAR(13)
SET @TSQLCommand01 = @TSQLCommand01 + N'SELECT  [TABLE_SCHEMA] , [TABLE_NAME] , N''T'' , 0' + CHAR(13)
SET @TSQLCommand01 = @TSQLCommand01 + N'FROM    [INFORMATION_SCHEMA].[TABLES]' + CHAR(13)
SET @TSQLCommand01 = @TSQLCommand01 + N'WHERE   [TABLE_TYPE] = N''BASE TABLE''' + CHAR(13)
SET @TSQLCommand01 = @TSQLCommand01 + N'AND CHARINDEX(N'''''''' , [TABLE_NAME]) = 0' + CHAR(13)
SET @TSQLCommand01 = @TSQLCommand01 + N'ORDER BY [TABLE_SCHEMA], [TABLE_NAME]' + CHAR(13)
SET @TSQLCommand01 = @TSQLCommand01 + N'SELECT @SQLStatementID02 = MIN([ID]) FROM @AllObjects WHERE [Completed] = 0' + CHAR(13)
SET @TSQLCommand01 = @TSQLCommand01 + N'WHILE @SQLStatementID02 IS NOT NULL' + CHAR(13)
SET @TSQLCommand01 = @TSQLCommand01 + N'BEGIN' + CHAR(13)
SET @TSQLCommand01 = @TSQLCommand01 + CHAR(9) + N'SELECT @CurrentObjectSchema = [ObjectSchema] , @CurrentObjectName = [ObjectName] , @CurrentObjectType = [ObjectType]' + CHAR(13)
SET @TSQLCommand01 = @TSQLCommand01 + CHAR(9) + N'FROM @AllObjects WHERE [ID] = @SQLStatementID02' + CHAR(13)
SET @TSQLCommand01 = @TSQLCommand01 + CHAR(9) + N'SET @CurrentObjectFullName = QUOTENAME(@CurrentObjectSchema) + ''.'' + QUOTENAME(@CurrentObjectName)' + CHAR(13)
SET @TSQLCommand01 = @TSQLCommand01 + CHAR(9) + N'INSERT INTO #Table_Size_Info ([ObjectName] , [NumRows] , [Reserved] , [Data] , [IndexSize] , [Unused] )' + CHAR(13) + N'EXEC [sp_spaceused] @CurrentObjectFullName' + CHAR(13)
SET @TSQLCommand01 = @TSQLCommand01 + CHAR(9) + N'UPDATE #Table_Size_Info SET [ObjectName] = @CurrentObjectFullName , [ObjectType] = @CurrentObjectType WHERE [ID] = SCOPE_IDENTITY();' + CHAR(13)
SET @TSQLCommand01 = @TSQLCommand01 + CHAR(9) + N'UPDATE @AllObjects' + CHAR(13) + N'SET [Completed] = 1' + CHAR(13) + N'WHERE [ID] = @SQLStatementID02' + CHAR(13)
SET @TSQLCommand01 = @TSQLCommand01 + CHAR(9) + N'SELECT @SQLStatementID02 = MIN([ID]) FROM @AllObjects WHERE [Completed] = 0' + CHAR(13)
SET @TSQLCommand01 = @TSQLCommand01 + N'END' + CHAR(13)

EXEC [sp_executesql] @TSQLCommand01

SELECT *
FROM #Table_Size_Info
GO

Here is the output when I executed it against the AdventureWorks2012 database:

[Screenshot 04]

I also compared the STATISTICS IO output of this script with that of the two queries above, and noticed that it is much slower than both of them, but still much faster than the equivalent SELECT COUNT(*) version.

I hope you liked this post :-) . Please feel free to use the comment section if you have any questions regarding this post. You can also send me a question via Twitter (@BasitAali).

