Blog Post

Estimating Data Compression Savings in SQL Server 2008 and 2008 R2

,

Since I finished upgrading the last of my Production environment to SQL Server 2008 R2 running on Windows Server 2008 R2 a couple of weeks ago, I have been spending some time re-evaluating which indexes can take advantage of data compression in SQL Server 2008 R2. By design, there is no “Compress Entire Database” command in SQL Server 2008 R2. Instead, you need to evaluate individual indexes, based on their size, estimated savings and volatility. The ideal case is a large table that shows very good compression savings that is read-only. A bad candidate is a small table, that does not show much compression savings, with very volatile data.

I quickly got tired of manually running sp_estimate_data_compression_savings with hard-coded parameters for each index in a database, so I decided to write some T-SQL that would somewhat automate the process.  If you set the schema name, table name, and desired data compression type in the variable declarations at the top of the script, you will get some pretty detailed information about all of the indexes in that table.

You should run the entire query at once, after you have supplied your own values. It may take a few seconds to run, depending on your hardware and on how large your tables are.  You could also wrap part of this in a stored procedure that you could call for each table in a database, and have it write the results out to a table that you could easily query later.

-- SQL Server 2008 and R2 Data Compression Queries
-- Glenn Berry 
-- August 2010
-- http://glennberrysqlperformance.spaces.live.com/
-- Twitter: GlennAlanBerry
-- Get estimated data compression savings and other index info
-- for every index in the specified table
SET NOCOUNT ON;
DECLARE @SchemaName sysname = N'dbo';                  -- Specify schema name
DECLARE @TableName sysname = N'WhiteListSearchLog';    -- Specify table name
DECLARE @IndexID int = 1;
DECLARE @CompressionType nvarchar(60) = N'PAGE'        -- Specify data compression type (PAGE, ROW, or NONE)
-- Get Table name, row count, and compression status 
-- for clustered index or heap table
SELECT OBJECT_NAME(object_id) AS [ObjectName], 
SUM(Rows) AS [RowCount], data_compression_desc AS [CompressionType]
FROM sys.partitions 
WHERE index_id < 2 --ignore the partitions from the non-clustered index if any
AND OBJECT_NAME(object_id) = @TableName
GROUP BY object_id, data_compression_desc
ORDER BY SUM(Rows) DESC;
-- Breaks down buffers used by current database by object (table, index) in the buffer pool
SELECT OBJECT_NAME(p.[object_id]) AS [ObjectName], 
p.index_id, COUNT(*)/128 AS [Buffer size(MB)],  COUNT(*) AS [BufferCount], 
p.data_compression_desc AS [CompressionType]
FROM sys.allocation_units AS a
INNER JOIN sys.dm_os_buffer_descriptors AS b
ON a.allocation_unit_id = b.allocation_unit_id
INNER JOIN sys.partitions AS p
ON a.container_id = p.hobt_id
WHERE b.database_id = DB_ID()
AND OBJECT_NAME(p.[object_id]) = @TableName
AND p.[object_id] > 100
GROUP BY p.[object_id], p.index_id, p.data_compression_desc
ORDER BY [BufferCount] DESC;
-- Shows you which indexes are taking the most space in the buffer cache
-- Get current and estimated size for every index in specified table
DECLARE curIndexID CURSOR
FAST_FORWARD
FOR
    -- Get list of index IDs for this table
    SELECT s.index_id
    FROM sys.dm_db_index_usage_stats AS s
    WHERE OBJECT_NAME(s.[object_id]) = @TableName
    AND s.database_id = DB_ID()
    ORDER BY s.index_id;
        
OPEN curIndexID;
FETCH NEXT
FROM curIndexID
INTO @IndexID;
-- Loop through every index in the table and run sp_estimate_data_compression_savings
WHILE @@FETCH_STATUS = 0
    BEGIN
        -- Get current and estimated size for specified index with specified compression type
        EXEC sp_estimate_data_compression_savings @SchemaName, @TableName, @IndexID, NULL, @CompressionType;
        FETCH NEXT
        FROM curIndexID
        INTO @IndexID;
    END
CLOSE curIndexID;
DEALLOCATE curIndexID;
-- Index Read/Write stats for a single table
SELECT OBJECT_NAME(s.[object_id]) AS [TableName], 
i.name AS [IndexName], i.index_id,
SUM(user_seeks) AS [User Seeks], SUM(user_scans) AS [User Scans], 
SUM(user_lookups)AS [User Lookups],
SUM(user_seeks + user_scans + user_lookups)AS [Total Reads], 
SUM(user_updates) AS [Total Writes]     
FROM sys.dm_db_index_usage_stats AS s
INNER JOIN sys.indexes AS i
ON s.[object_id] = i.[object_id]
AND i.index_id = s.index_id
WHERE OBJECTPROPERTY(s.[object_id],'IsUserTable') = 1
AND s.database_id = DB_ID()
AND OBJECT_NAME(s.[object_id]) = @TableName
GROUP BY OBJECT_NAME(s.[object_id]), i.name, i.index_id
ORDER BY [Total Writes] DESC, [Total Reads] DESC;
-- Get basic index information (does not include filtered indexes or included columns)
EXEC sp_helpindex @TableName;
-- Get size and available space for files in current database
SELECT name AS [File Name] , physical_name AS [Physical Name], size/128.0 AS [Total Size in MB],
size/128.0 - CAST(FILEPROPERTY(name, 'SpaceUsed') AS int)/128.0 AS [Available Space In MB], [file_id]
FROM sys.database_files;

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating