Blog Post

SQL 2008 Row and Page compression – or SQL 2005 (post SP2) vardecimal conversion


One of the great new features in SQL 2008 is Row and/or Page Compression. Plus, still good news for those of you on SQL 2005 in production who might be there for a long while: there’s a decent feature you can take advantage of too. How? With compression in SQL 2008 Developer or Entreprise edition, you will want to benefit from this mega space saver (like 1.3TB down to 650GB as I have seen, plus queries running at as little as 40% the time they took before (after space optimisation and defragmentation). The equivalent, although with less options, is VARDECIMAL conversion in SQL 2005 (pre-requisite of SP2 on Entreprise). Both have stored procedures that you can run to estimate how much space you will save: In 2008 it's sp_estimate_data_compression_savings -- which randomly takes rows and gives you quite an accurate estimation of space saved/gained.  Set your statistics and I/O on just for details of the page reads while doing the comparison before and after. For reference, see Brad's [McGehee] Compression examples.

In SQL 2005 Entreprise/Developer, from sp2 onwards, you can do this:
exec sys.sp_db_vardecimal_storage_format ''DatabaseName'', ''ON''

-- exec sys.sp_db_vardecimal_storage_format ''DatabaseName'', ''OFF'' -- rollback

exec sp_tableoption ''dbo.BigAssTableLoadedWithDecimals'', ''vardecimal storage format'', 1

-- exec sp_tableoption ''dbo.BigAssTableLoadedWithDecimals'', ''vardecimal storage format'', 0 -- rollback


0 can be replaced with the amount of space you want to leave free for expansion, normally if you have huge decimal types in tables with millions or rows, you'll see a huge difference in size of the db after the shrink.
For more details please see:

Please take care while you are doing your big table conversions and make sure that you have enough space for the whole table to be added onto the MDF file and relatively heavy load on the LDF [log] data file. What you can do, as a temporary solution, is switch to bulk mode (set recovery level) while you are doing the compression (not forgetting to take a full backup of course please), then re-enable Full/Bulk or whatever recovery mode you were using before.
Basically what you are doing is clamping down on the wasted space per line with respect to Row level compression in SQL 2008 (characters even get dictionary compression, and also prefix compression - see here for all the types), which is what SQL 2005 SP2 calls vardecimal although limited to decimal data type only. Obviously, the 2008/5 Entreprise Edition has upped the cost for you to have this functionality, but it goes with the territory – a large organisation has major space to gain usually, and don’t forget that it’s not only the space in the database you are gaining, it’s all the cumulative space in backups saved over time you'll save too, plus and all the increased time (performance) for execution of queries. One could argue simply that by taking the huge number of times a query costs in CPU time, multiplied by its frequency of execution to get the CPU hours saved. Imagine? 'Hey boss, where's my bonus, I just saved the company n,000 hours of CPU execution tme:)' ; remember that clients can get their data from reports in less than half the time!

You could do this in the temp database also, rebuild an object using the temp while adding the compression to the specific object during creation, take care to make sure your temp is as big as the object you are rebuilding– such as the index+table data size, with a good fifty percent on top of that to be sure you’re not going to run out of data space for other applications. Some organisations do not use the Temp at all, while others, I have seen, are completely dependent on the TempDB...they have reserved physical disks for Temp due to intensive operations.

The Data Compression Wizard in SQL Server Management studio demystifies much of this, if you are already lost. Mind you, it’s not very efficient if you have to do this on a whole bunch of objects, in that case you would be better off using T-SQL. Sort your tables by the largest first (data pages should be in the thousands), and evaluate where the space savings lies. If you are in the millions of data pages, use this compression – as mentioned before even in SQL 2005 post Service Pack 2 build you can take advantage of row compression by means of VARDECIMAL (but not page compression sadly).

Another condition you should be aware of, If you are input/output bound on your system, meaning that you are waiting on your disks, then you can benefit from compression – Brad McGehee has stated that he prefers Row compressio
n for OLTP environments (mentioned early December '08 at SQLTeach in Montreal) – and if the e-DBA guru's mentioning it, then it’s really worth looking into.

 If you are using tables with multiple non-clustered indexes, only compress those indexes that are used occasionally. Heavily used indexes are to be avoided therefore – so, as always, to be sure TEST, TEST, TEST....on several servers, your test environment, your disaster recovery environment, your dev., and even your laptop (still enjoying disk performance for databases on Dell’s XPS Samsung SSD – holding back on a 32GB SSD express card for logs/swap/temp in the New Year). Do it on your development environment just to save space on large tables – b/c Dev environments are usually starved for data, and then just watch what happens over the next weeks....check and see if queries are taking forever/faster to run, and let a few of the developers know progressively – maybe they’ll see something you missed out.  SQL DBAs - get out there and compress, it's your I/O, performance and space saving duty!


Page Compression:
Compression Strategies: