Blog Post

Data compression

,

This post examines the compression of a database named Bankware (BW) at a client site. The database size was about 300+ GB.
The tests for BW were done on the sk-bankware server aimed for the ART (Accounting, Reporting and Treasury) applications in the Skopje regional office of the client. The sk-bankware server owns the following HW&SW:
CPU: NUMA 2 x 12 = 24 coresRAM: 32 GB (28 GB dedicated to the SQL Server instance)Storage: RAID5 disks 2.5 TBSQL Server: Microsoft SQL Server Enterprise Edition (64-bit)OS: Windows server 2008 R2 Standard edition
Short introduction to compression
Data compression in SQL Server can be of two types: PAGE and ROW. The PAGE type compression is more common for the OLAP-like systems and the ROW type is more common for the OLTP-like systems. As the PAGE type is stronger than ROW, it requires the CPU to do more compressing and decompressing of the data, and it’s one reason why PAGE is usually used in systems where there is no much data update (update, delete, insert), or it is less frequently done.
Data compression is recommended by Microsoft for both OLTP and OLAP systems. In the following white paper they show the benefits of the compression as an advanced feature. Then, SAP by default use ROW type of compression for all their applications (blog white paper 1, white paper 2). Another satisfying story of this feature is the following of NetApp.
Conclusion of the Microsoft’s white paper for the compression is that it’s safe and useful feature for all systems in general.
Advantages and disadvantages
In order to check and ensure that the recommendations for applying data compression on systems is working well for Bankware, some tests were done on it.
There are advantages and disadvantages of applying compression on BW. If the advantages are high and disadvantages are low or negligible, then the tradeoff is warrantable and the feature is useful for the system where it’s applied. If the disadvantages’ impact is significant on the system, then the feature should not be applied to the system.
Advantages identified:
Significantly reduced storage for the dataSignificantly reduced number of data pages in the memory buffer of the SQL Server instanceSignificantly reduced I/O transferFaster queries
Disadvantages identified:
CHECKDB is expected to last moreREBUILD indexes operation is expected to last moreSome other maintenance operations (update stats, reorganize, …) are expected to last more
Results and tradeoff of the compression
The PAGEtype compression was applied on Bankware. Sometimes the PAGE and ROW types of compression don't differ much in size. However PAGE is more recommended for OLAP-like systems. The decision for the PAGE type was chosen additionally on the columns definitions analysis. The overview showed that the char-type columns with their definitions dominate in terms of the storage required for storing the data. 
Data storage, I/O performance counters and Queries execution
Data storage.

The gain of the applied compression is shown on Figure 1.

Figure 1. Data storage for compressed and uncompressed Bankware
The storage gain by applying compression usually flows between 30-90%. An OLAP system can be compressed up to 90%, whereas the OLTP systems are usually compressed within the range of 30-50%. A compression rate of 60% for our Bankware is very satisfying.
I/O performance counters. The I/O benefit is significant. A number of 21 I/O performance counters were monitored for a simulating multi-session environment during an intensive 30 minute time interval. Their values are showed in the table below.

Table: I/O Performance counters
Results are improving.
Queries speed and execution. The queries that execute behind the following reports were run on a compressed and uncompressed version of the Bankware database with equal overloading conditions:
Reporting->Reports->Current Internal BalanceReporting->Reports->StatementsReporting->Entry analysis->Ledger entriesReporting->Entry analysis->Subledger entriesReporting->Reports->Generate report
The number of queries that took part in the test was around 170. Aggregated results for the runs of the two groups of queries are shown in the next Figure 2.

Figure 2. Execution on queries on a compressed and uncompressed Bankware database
The gain for the queries execution speed is 23%. This is a good improvement too.
CHECKDB
CHECKDB is a maintenance command that is used at Bankware on a weekly basis. The results were expected to be lower for the compressed database because of the compressed data; and in that case more operations are done in the CPU, like decompression, check and compression operations. When the CHECKDB is done on uncompressed only the check operation is done by the CPU. Results are shown in the next Figure 3.

Figure 3. Results for the CHECKDB on compressed and uncompressed BW database
Whilst the CHECKDB for uncompressed database finished in 1995 seconds (33 min and 15 sec), the same operation for the compressed database lasted for 2476 seconds (41 min and 16 sec). The time duration for this operation was about 20% more on the compressed database.
REBUILD
Test for the indexes maintenance was done too. Less page splits will occur in the new index structure on the compressed database but more CPU time will be spent for the compression and decompression during the update-type operations. Results are shown in Figure 4.

Figure 4. Results for the REBUILD operation for compressed and uncompressed BW database
The rebuild time for the indexes does not differ much, i.e. the difference is about 300 seconds or 5 minutes, which compared to 1 hour and 30 minutes (approx) is practically not a difference.
Final words
A conclusion is that compression as an advanced feature in SQL Server is fitting very good on the Bankware database and therefore it should be applied. The tradeoff between the advantages and disadvantages is positive for the advantages. Disadvantages are such small that they are almost not to be noticed.
Another reason for the appliance of the compression on Bankware is that it’s a system which is not a a write-intensive during the working hours. Quite a good benefit comes for the reports which are used during the working hours.

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating