• PiMané (9/28/2016)


    Eric M Russell (9/28/2016)


    Data compression is mostly about optimizing storage. Page compression will yield approximately 70% compression in tables and indexes. By reducing the number of page reads and page buffers, it can potentially improve query performance, but the degree of improvement depends on the workload. Queries that scan across a large number of pages, such as full or partial table/index scans, will benefit most, but even then it probably won't be a 70% improvement.

    Data compression works best for tables that are mostly inserted and selected. If you update compressed data, especially if you're updating primary keys, then compression can actually degrade performance significantly due to page splitting.
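    The expected ratio can be checked ahead of time rather than assumed: sp_estimate_data_compression_savings samples the table and projects compressed sizes. A minimal sketch, where dbo.Orders is a placeholder table name:

    ```sql
    -- Estimate PAGE compression savings for a hypothetical dbo.Orders table
    -- (sample-based; the actual ratio depends on the data distribution).
    EXEC sys.sp_estimate_data_compression_savings
        @schema_name      = 'dbo',
        @object_name      = 'Orders',      -- placeholder table name
        @index_id         = NULL,          -- NULL = all indexes on the table
        @partition_number = NULL,
        @data_compression = 'PAGE';

    -- If the projected savings look worthwhile, enable it:
    ALTER TABLE dbo.Orders REBUILD WITH (DATA_COMPRESSION = PAGE);
    ```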

    I know that a compression of 70% doesn't imply a performance improvement of 70%...

    Data compressed => fewer reads => less IO (which is almost always where the bottleneck is...).

    The primary key is never updated (identity seed), but the data is often updated... The data is selected by a reference field entered by the user... That's why the PK has many writes and almost no reads, while the auxiliary index (user + reference) has lots of reads and no writes (updates).

    Any table or index whose fields are updated often won't benefit from compression, right?

    Thanks,

    Pedro

    When the size of a previously inserted row increases in length (for example, when it's updated), the row must be marked as deleted and moved to another location in the page. If the data within a page is updated in a way that prevents all the rows from fitting back into the same 8K page, then SQL Server must allocate a new page and split the rows between the two. This is more problematic when pages are compressed, because the compression ratio for a page depends on how the data is populated. For example, if you insert rows with NULL columns and then go back later and update those rows with values, that data is less sparse than it was when initially inserted, the compression ratio will drop, and you'll see page splits.
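    One way to see whether updates are actually triggering splits is to watch leaf-level page allocations on the index before and after a batch of updates. A sketch, with dbo.Orders again a placeholder name:

    ```sql
    -- leaf_allocation_count rises when new leaf pages are allocated,
    -- which includes leaf-level page splits; snapshot it before and
    -- after an update batch and compare.
    SELECT  OBJECT_NAME(ios.object_id)  AS table_name,
            i.name                      AS index_name,
            ios.leaf_allocation_count   -- includes leaf-level page splits
    FROM    sys.dm_db_index_operational_stats(DB_ID(), OBJECT_ID('dbo.Orders'), NULL, NULL) AS ios
    JOIN    sys.indexes AS i
            ON i.object_id = ios.object_id
           AND i.index_id  = ios.index_id;
    ```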

    It's basically the same page splitting issue we've always had with non-compressed data, only compression complicates it. But it can be mitigated by specifying a FILLFACTOR somewhere between 70 and 90 and setting PAD_INDEX = ON.
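    A rebuild along those lines might look like the following (the index and table names are placeholders):

    ```sql
    -- Leave ~20% free space on leaf pages, and (via PAD_INDEX) on the
    -- intermediate pages too, so updated rows can grow without splits.
    ALTER INDEX IX_Orders_UserRef ON dbo.Orders
    REBUILD WITH (FILLFACTOR = 80, PAD_INDEX = ON, DATA_COMPRESSION = PAGE);
    ```

    Note that FILLFACTOR only applies at rebuild time; pages fill back up as data changes, so periodic rebuilds are part of the mitigation.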

    https://blogs.msdn.microsoft.com/sqlserverstorageengine/2011/05/12/data-compression-and-fill-factor/

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho