http://www.sqlservercentral.com/blogs/sqlrnnr/2010/05/25/space-used/

Printed 2014/12/29 08:17AM

Space Used

By Jason Brimhall, 2010/05/25

At long last I am bringing the next installment in this mini-series.  You can find the rest of the articles by perusing the last article I wrote on the topic.  That article was a correction to the TableSpace script about which I had already blogged.

As you will recall, this series was designed to compare and contrast two MS provided stored procedures.  Those procedures are sp_spaceused and sp_MStablespace.  They both calculate table sizes and information but they do not always arrive at the same result.  It was the difference in results that prompted this little series of blogs.  In this article I will be examining sp_spaceused, or at least the re-write I did for this proc.  There will likely be one more article in the series.  That article will be to compare performance and wrap up the series.

Problem

As was stated in the Tablespace article, there are two main problems with this procedure.  The first is that some measure of looping must be used, if one is to use the stored procedure as is, to gather all of the information for all of the tables in a database.  This problem can be rectified by understanding the inner workings of the stored procedure.  The second issue is that it does not return data that is consistent with that of sp_MStablespace.  I would like for the data to be consistent between the two scripts that may be used.  If the data is the same, then it will only be a matter of preference as to which script one decides to use.  This should help eliminate doubt.

Looping

Much the same as needs to be done with the sp_MStablespace procedure, a loop can be used to return the data for all tables.  Based on my experience, the looping query does not take long to execute.  It really is somewhat unnecessary though.  Here is a sample of how such a loop might be done.

Code block   
/*
Looping mechanism for sp_spaceused
*/
 
-- Create the temp table for further querying
CREATE TABLE #spaceused(
	rec_id			INT IDENTITY (1, 1),
	TableName		VARCHAR(128),
	NumRows			VARCHAR(11),
	reserved		VARCHAR(18),
	DataSpaceMB		VARCHAR(18),
	IndexSpaceMB	VARCHAR(18),
	unused			VARCHAR(18),
	TotalSizeMB		DECIMAL(15,2),
	percent_of_db	DECIMAL(15,12),
	DBSizeMB		DECIMAL(15,2)
	)
 
-- Get all tables, names, and sizes
EXEC sp_msforeachtable @command1="insert into #spaceused(TableName,NumRows,reserved, DataSpaceMB, IndexSpaceMB,unused) exec sp_spaceused '?'"
 
UPDATE #spaceused
	SET DataSpaceMB = REPLACE(DataSpaceMB,' KB','')
		,IndexSpaceMB = REPLACE(IndexSpaceMB,' KB','')
 
-- Set the TotalSizeMB and total database size fields
UPDATE #spaceused
SET TotalSizeMB = (CONVERT(DECIMAL(15,2),DataSpaceMB) + CONVERT(DECIMAL(15,2),IndexSpaceMB))/1024
	,DBSizeMB = (SELECT SUM(CONVERT(DECIMAL(15,2),DataSpaceMB) + CONVERT(DECIMAL(15,2),IndexSpaceMB))/1024 FROM #spaceused)
 
-- Set the percent of the total database size
UPDATE #spaceused
	SET percent_of_db = (TotalSizeMB/DBSizeMB) * 100
 
-- Get the data
--sp_spaceused
SELECT TableName,NumRows, CONVERT(DECIMAL(15,2),DataSpaceMB)/1024 AS DataSpaceMB
	,CONVERT(DECIMAL(15,2),IndexSpaceMB)/1024 AS IndexSpaceMB, TotalSizeMB, percent_of_db, DBSizeMB
	FROM #spaceused
	ORDER BY TotalSizeMB DESC
 
/* Drop Temp Table
DROP TABLE #spaceused
*/

In this version of the loop, it must be noted that an additional step must be taken due to the nature of the data that is returned by sp_spaceused.  This procedure returns ‘ KB’ in the output.  This additional data must be treated so proper calculations can be performed.

Inconsistencies

As stated in the aforementioned article on sp_MStablespace, the data returned between the two does not always match (in regards to table size).  The reason for this is due to a difference in calculation and it seems intent.  Thus both procs can be deemed accurate and correct.  However, that does not breed any higher level of confidence in the return data from both procedures.  I hope to elaborate somewhat on these subtle differences.

Discovery

I am using the stored procedure from SQL 2008 as the basis for my analysis.  One of the first things I have noted is that the procedure has been mostly updated in support of the new SQL 2008 objects.  The one exception is that it still uses dbo.sysfiles to try and determine the database size and logsize.  Beyond that, the proc makes use of the system views sys.partitions, sys.allocation_units, sys.internal_tables, sys.objects and sys.dm_db_partition_stats.

sys.objects

The use of this view is primarily to get the object_id of the table that was passed in to the stored procedure.  For this particular version, this can be eliminated since we are trying to get a solution that does not require looping.  For a later version, I will look into reincorporating that view into the script to allow either a single table or all tables to be returned.

sysfiles

As mentioned already, sysfiles is present in order to retrieve the database size and the logfile size.  This can be replaced by sys.database_files.  In testing, I have found that using sys.database_files is more consistent and it is also more accurate than using sysfiles.

Revision

Since this procedure was already mostly up to date with the new objects for 2005 / 2008, most of it is reusable and the changes are less dramatic than they were for the counterpart sp_MStablespace.  Much the same as the script for MStablespace, I did employ the use of CTEs to get the data that I wanted.  I use three CTEs to retrieve all of the necessary information.  I am also using the same calculations that were used in the original procedure that determined table size.  The exception being in the database size and the log size.  For ease of consumption, I will once again break the script up into smaller chunks.

Part I

Code block   
/*similar to sp_spaceused */
 
DECLARE @dbsize DECIMAL(19,2)
		,@logsize DECIMAL(19,2)
 
SET NOCOUNT ON
 
/*
**  Summary data.
*/
BEGIN
	SELECT @dbsize = SUM(CONVERT(DECIMAL(19,2),CASE WHEN type = 0 THEN SIZE ELSE 0 END)) * 8/1024
		, @logsize = SUM(CONVERT(DECIMAL(19,2),CASE WHEN type = 1 THEN SIZE ELSE 0 END)) * 8/1024
		FROM sys.database_files
 
END

This is pretty straightforward here.  I am retrieving and summing the values needed for the database size and the log size.

Part II

Code block   
BEGIN
	WITH FirstPass AS (
		SELECT OBJECT_ID,
			ReservedPage = CONVERT(DECIMAL(19,2),SUM(reserved_page_count)) * 8/1024,
			UsedPage = CONVERT(DECIMAL(19,2),SUM(used_page_count)) *8/1024,
			PageCnt = SUM(
			CONVERT(DECIMAL(19,2),CASE
				WHEN (index_id < 2)
					THEN (used_page_count)
				ELSE lob_used_page_count + row_overflow_used_page_count
				END
			)) * 8/1024,
			RowCnt = SUM(
			CASE
				WHEN (index_id < 2)
					THEN row_count
				ELSE 0
			END
			)
		FROM sys.dm_db_partition_stats
		--Where OBJECTPROPERTY(OBJECT_ID,'IsMSShipped') = 0
		GROUP BY OBJECT_ID
	)
	,InternalTables AS (
		SELECT ps.OBJECT_ID,
			ReservedPage = CONVERT(DECIMAL(19,2),SUM(reserved_page_count)) * 8/1024,
			UsedPage = CONVERT(DECIMAL(19,2),SUM(used_page_count)) *8/1024
		FROM sys.dm_db_partition_stats  ps
			INNER Join sys.internal_tables it
				ON it.OBJECT_ID = ps.OBJECT_ID
				And it.internal_type IN (202,204,211,212,213,214,215,216)
		WHERE it.parent_id = ps.OBJECT_ID
			--And OBJECTPROPERTY(ps.OBJECT_ID,'IsMSShipped') = 0
		GROUP BY ps.OBJECT_ID
	)
	,Summary AS (
		SELECT
			ObjName = OBJECT_NAME (f.OBJECT_ID),
			NumRows = MAX(f.rowcnt),
			ReservedPageMB = SUM(IsNull(f.reservedpage,0) + IsNull(i.ReservedPage,0)),
			DataSizeMB = SUM(f.PageCnt),
			IndexSizeMB = SUM(CASE WHEN (f.UsedPage + IsNull(i.UsedPage,0)) > f.PageCnt
							THEN ((f.UsedPage + IsNull(i.UsedPage,0)) - f.PageCnt) ELSE 0 END) ,-- Equivalent of max_record_size from sys.dm_db_index_physical_stats
			UnusedSpace = SUM(CASE WHEN (f.ReservedPage + IsNull(i.ReservedPage,0)) > (f.UsedPage + IsNull(i.UsedPage,0))
				THEN ((f.ReservedPage + IsNull(i.ReservedPage,0)) - (f.UsedPage + IsNull(i.UsedPage,0))) ELSE 0 END),
			DBSizeMB = @Dbsize,
			LogSizeMB = @logsize
		FROM FirstPass F
			LEFT Outer Join InternalTables i
			ON i.OBJECT_ID = f.OBJECT_ID
		GROUP BY f.OBJECT_ID
	)
	SELECT ObjName,NumRows, ReservedPageMB, DataSizeMB, IndexSizeMB, UnusedSpace, DBSizeMB, LogSizeMB,
			PercentofDB = ((IndexSizeMb + DataSizeMB) / @DBsize) * 100
 
	FROM Summary
	ORDER BY PercentofDB DESC
END

Note here that I have left code to remove MSShipped objects from the total.  This is a placeholder simply for if somebody would like to test that result.  The only thing other than that to explain is the internal_types.  I am checking for a range of internal table types just as is done in the original stored procedure.  A quick rundown of what those types is as follows (and can be found on MSDN):  202 = xml index nodes, 204 = fulltext_catalog_map, 211 = FULLTEXT_AVDL, 213 = FULLTEXT_DOCID_STATUS, 214 = FULLTEXT_INDEXED_DOCID, 215 = FULLTEXT_DOCID_FILTER, 216 = FULLTEXT_DOCID_MAP, and I was unable to locate what type 212 is.  I presume it is another FullText type.  For those types, the data is not counted as part of the data but is aggregated into the index size.

The final point with this script to note is in regards to an additional comment I left in the script.  The comment is in regards to how the size is calculated and I wanted to be certain that it was explained how that relates.  In the MStablespace script, I use sys.dm_db_index_physical_stats to help determine the index and table size.  My comment here relates to that DMF in that the spaceused version is equivalent to the max_record_size field in that DMF.  This procedure and script determine the size of a table based on the maximum size and of a page and then multiplies that across all pages.  The MStablespace script is more equivalent to calculating based on average fill of those pages.  The MStablespace script can easily be modified to support the max size and thus bring both scripts into harmony.  Beyond those points, I think the rest of the script is pretty self-explanatory.

Results

This script now permits us to get the full size of all tables in a database without the need for looping through a single stored procedure.  We also see that at this point we have outlined why the two Microsoft stored procedures returned different results.  With both scripts now returning the same data, and doing so consistently we can now begin to evaluate performance measures.


Copyright © 2002-2014 Simple Talk Publishing. All Rights Reserved. Privacy Policy. Terms of Use. Report Abuse.