Do you know much about data compression? It seems that quite a few people I meet don’t really understand how it works, or what the benefits are. I heard about compression first hand from Sunil Agarwal a few years back at TechEd. I visited one of his sessions and then he was gracious enough to spend about 15-20 minutes with me afterwards answering more questions about additional details. At the time, I was just another schmuck in the audience, but Sunil was very pleasant and patient with me.
I was lucky that day,but Sunil has a video where he explains some of the ways in which compression works for Microsoft customers.
The summary is that data compression works in a few ways. One is, obviously, to save space. The second, however, is that fewer I/Os’s are required to move data on or off a disk, which can result in a performance increase. The SQL Server team tried to strike a balance in how they implemented compression to allow for both of these benefits. The tradeoff is that CPU is needed to compress or decompress data, so while more compression is possible, it can negatively impact performance.
Compression is a cool feature, and consists of a few parts. You can read more about each of them if you are interested:
- Row compression
- Prefix compression (part of page compression)
- Dictionary compression (also part of page compression)
- Unicode Compression (SQL Server 2008 R2 only)
The DBA has complete control over what is compressed, but it is also some work to analyze the space savings and then make a determination about whether or not it is worth you compressing your table. If the space savings is too small, you might end up negating any storage gains with additional CPU costs.
How Do I Decide?
There is a stored procedure (sp_estimate_data_compression_savings) that can be used on tables or indexes to help you determine what the savings are. Numerous scripts are available, like this one from Paul Neilsen – Whole Database – Data Compression Procs.
There’s also one easy decision. Only Enterprise and Data Center editions of SQL Server support this feature, so if you aren’t running those, you don’t have to worry about this.
There are a couple other alternatives to implementing data compression. If you want to get compression savings in other editions of SQL Server, my company, Red Gate Software, makes a product called SQL Storage Compress, which implements compression at the file level.
This is based on Hyperbac technology and is completely transparent to SQL Server. We are working on some extensive case studies right now and this product is part of the SQL Server I/O Reliability Program. This can help performance on your system by reducing I/O.
If space is an issue in test, customer service, and development environments, you can use Virtual Restore to mount a backup file as a database that you can read, or write to. The writes occur in a sparse file, so your backup file is untouched. However you get a quicker “restore” that is transparent to SQL Server.
As I mentioned, I work for Red Gate software, so take these as mentions of products my company builds and sells. I’ve heard great things about them, and use them in a few places for work at SQLServerCentral. I’d recommend you try them and see if they are a fit in your environment and have a good price/value ratio for your company.