
Estimating Data Compression Savings in SQL Server 2008 and 2008 R2

Since I finished upgrading the last of my Production environment to SQL Server 2008 R2 running on Windows Server 2008 R2 a couple of weeks ago, I have been spending some time re-evaluating which indexes can take advantage of data compression in SQL Server 2008 R2. By design, there is no “Compress Entire Database” command in SQL Server 2008 R2. Instead, you need to evaluate individual indexes based on their size, estimated savings, and volatility. The ideal candidate is a large, read-only table that shows very good compression savings. A poor candidate is a small table with very volatile data that does not compress well.
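Once an index does qualify, compression is applied with an ALTER INDEX rebuild. The table and index names below are just illustrations (and remember that both data compression and ONLINE index rebuilds require Enterprise Edition):

```sql
-- Apply PAGE compression to one index (names here are hypothetical examples)
ALTER INDEX IX_WhiteListSearchLog_SearchDate
ON dbo.WhiteListSearchLog
REBUILD WITH (DATA_COMPRESSION = PAGE, ONLINE = ON);
```

Use DATA_COMPRESSION = ROW for volatile tables where PAGE compression's extra CPU cost is not worth it, or DATA_COMPRESSION = NONE to back a change out.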

I quickly got tired of manually running sp_estimate_data_compression_savings with hard-coded parameters for each index in a database, so I decided to write some T-SQL that would somewhat automate the process.  If you set the schema name, table name, and desired data compression type in the variable declarations at the top of the script, you will get some pretty detailed information about all of the indexes in that table.
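For comparison, a single manual run with hard-coded parameters looks like this (estimating PAGE compression for index 1 of the same table used in the script below):

```sql
-- Estimate PAGE compression savings for one specific index
EXEC sp_estimate_data_compression_savings
    @schema_name = N'dbo',
    @object_name = N'WhiteListSearchLog',
    @index_id = 1,                -- 1 = clustered index
    @partition_number = NULL,     -- NULL = all partitions
    @data_compression = N'PAGE';
```

Repeating this for every index on every candidate table is exactly the tedium the script below avoids.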

You should run the entire query at once, after you have supplied your own values. It may take a few seconds to run, depending on your hardware and on how large your tables are.  You could also wrap part of this in a stored procedure that you could call for each table in a database, and have it write the results out to a table that you could easily query later.
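As a sketch of that stored-procedure idea (the table and procedure names here are my own inventions, not part of the original script), you can capture the output of sp_estimate_data_compression_savings with INSERT ... EXEC. Note that passing NULL for @index_id estimates every index on the table in one call:

```sql
-- Hypothetical results table and wrapper procedure; adjust names to taste
CREATE TABLE dbo.CompressionEstimates
(
    object_name sysname,
    schema_name sysname,
    index_id int,
    partition_number int,
    current_size_KB bigint,
    requested_size_KB bigint,
    sample_current_size_KB bigint,
    sample_requested_size_KB bigint,
    run_date datetime NOT NULL DEFAULT GETDATE()
);
GO

CREATE PROCEDURE dbo.EstimateCompressionForTable
    @SchemaName sysname,
    @TableName sysname,
    @CompressionType nvarchar(60) = N'PAGE'
AS
BEGIN
    -- @index_id = NULL estimates every index on the table in one call
    INSERT INTO dbo.CompressionEstimates
        (object_name, schema_name, index_id, partition_number,
         current_size_KB, requested_size_KB,
         sample_current_size_KB, sample_requested_size_KB)
    EXEC sp_estimate_data_compression_savings
        @SchemaName, @TableName, NULL, NULL, @CompressionType;
END
GO
```

You could then call the procedure for each candidate table and query dbo.CompressionEstimates at leisure.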

-- SQL Server 2008 and R2 Data Compression Queries
-- Glenn Berry 
-- August 2010
-- http://glennberrysqlperformance.spaces.live.com/
-- Twitter: GlennAlanBerry

-- Get estimated data compression savings and other index info
-- for every index in the specified table
DECLARE @SchemaName sysname = N'dbo';                  -- Specify schema name
DECLARE @TableName sysname = N'WhiteListSearchLog';    -- Specify table name
DECLARE @IndexID int = 1;
DECLARE @CompressionType nvarchar(60) = N'PAGE';       -- Specify data compression type (PAGE, ROW, or NONE)

-- Get Table name, row count, and compression status 
-- for clustered index or heap table
SELECT OBJECT_NAME(object_id) AS [ObjectName], 
SUM(Rows) AS [RowCount], data_compression_desc AS [CompressionType]
FROM sys.partitions 
WHERE index_id < 2 --ignore the partitions from the non-clustered index if any
AND OBJECT_NAME(object_id) = @TableName
GROUP BY object_id, data_compression_desc

-- Breaks down buffers used by current database by object (table, index) in the buffer pool
SELECT OBJECT_NAME(p.[object_id]) AS [ObjectName], 
p.index_id, COUNT(*)/128 AS [Buffer size(MB)],  COUNT(*) AS [BufferCount], 
p.data_compression_desc AS [CompressionType]
FROM sys.allocation_units AS a
INNER JOIN sys.dm_os_buffer_descriptors AS b
ON a.allocation_unit_id = b.allocation_unit_id
INNER JOIN sys.partitions AS p
ON a.container_id = p.hobt_id
WHERE b.database_id = DB_ID()
AND OBJECT_NAME(p.[object_id]) = @TableName
AND p.[object_id] > 100
GROUP BY p.[object_id], p.index_id, p.data_compression_desc
ORDER BY [BufferCount] DESC;

-- Shows you which indexes are taking the most space in the buffer cache

-- Get current and estimated size for every index in specified table
DECLARE curIndexID CURSOR FAST_FORWARD FOR
    -- Get list of index IDs for this table
    SELECT s.index_id
    FROM sys.dm_db_index_usage_stats AS s
    WHERE OBJECT_NAME(s.[object_id]) = @TableName
    AND s.database_id = DB_ID()
    ORDER BY s.index_id;

OPEN curIndexID;

FETCH NEXT
FROM curIndexID
INTO @IndexID;

-- Loop through every index in the table and run sp_estimate_data_compression_savings
WHILE @@FETCH_STATUS = 0
    BEGIN
        -- Get current and estimated size for specified index with specified compression type
        EXEC sp_estimate_data_compression_savings @SchemaName, @TableName, @IndexID, NULL, @CompressionType;

        FETCH NEXT
        FROM curIndexID
        INTO @IndexID;
    END

CLOSE curIndexID;
DEALLOCATE curIndexID;

-- Index Read/Write stats for a single table
SELECT OBJECT_NAME(s.[object_id]) AS [TableName], 
i.name AS [IndexName], i.index_id,
SUM(user_seeks) AS [User Seeks], SUM(user_scans) AS [User Scans], 
SUM(user_lookups) AS [User Lookups],
SUM(user_seeks + user_scans + user_lookups) AS [Total Reads], 
SUM(user_updates) AS [Total Writes]
FROM sys.dm_db_index_usage_stats AS s
INNER JOIN sys.indexes AS i
ON s.[object_id] = i.[object_id]
AND i.index_id = s.index_id
WHERE OBJECTPROPERTY(s.[object_id],'IsUserTable') = 1
AND s.database_id = DB_ID()
AND OBJECT_NAME(s.[object_id]) = @TableName
GROUP BY OBJECT_NAME(s.[object_id]), i.name, i.index_id
ORDER BY [Total Writes] DESC, [Total Reads] DESC;

-- Get basic index information (does not include filtered indexes or included columns)
EXEC sp_helpindex @TableName;

-- Get size and available space for files in current database
SELECT name AS [File Name] , physical_name AS [Physical Name], size/128.0 AS [Total Size in MB],
size/128.0 - CAST(FILEPROPERTY(name, 'SpaceUsed') AS int)/128.0 AS [Available Space In MB], [file_id]
FROM sys.database_files;


Posted by cjones on 24 August 2010

If the table is in a schema other than dbo, then the schema and table name need to be passed to sp_helpindex.

-- Get basic index information (does not include filtered indexes or included columns)

DECLARE @Name NVARCHAR(776) = @SchemaName + '.' + @TableName;

EXEC sp_helpindex @Name;

Posted by Glenn Berry on 24 August 2010

Good catch. I will change that and make a couple of other changes in my original blog, but the SSC syndication won't pick up the changes here.

Posted by wagnerbianchijr on 28 August 2010

I am getting an error when testing your script:

Msg 15009, Level 16, State 1, Procedure sp_helpindex, Line 37

The object 'WhiteListSearchLog' does not exist in database 'master' or is invalid for this operation.

Need some workaround to run this script?

Posted by Glenn Berry on 29 August 2010

You need to be pointing at your database instead of master, and you need to specify a table name that exists in your database. WhiteListSearchLog is a table name in one of my databases.

Posted by SQL_ME_RICH on 9 May 2014

Is there an updated version of this script?  I am trying it out as is on a local instance of mine, and it takes a very long time to run against even my smaller tables.  I realize this post is 4 years old, but the analysis is definitely something I am needing to do.
