Page/Row DEcompression, when does it occur?

  • I can't seem to find the answer to this seemingly valid question which rather hints that the answer is obvious. Documentation states that compression reduces storage size and physical reads, but never really states when SQL Server decompresses a page or row during a SQL query.

    I got curious because during a recent disussion a fellow colleague claimed "using a bigint for an int value doesn't matter since its all compressed to X byte anyway" during a debate about table design and datatypes which is obviously irrelevant since one is a design issue in the universe of discourse and the other an installation configuration - but I realised my indepth knowledge of compression was lacking.

    Obviously compression increases the number of rows we can have in a given page which reduces IO, but what I really want to know is will it increase the number of rows of a compressed index that that will fit in a CPU cacheline during a nested loop join? Is it a strict No to the question if page/row compression can reduce cache-misses during join operator executions?

    If decompression occurs as soon as a page enters the buffer pool then the answer is no of course. Maybe that is the obvious answer I am looking for. I realize the answer might be different or not as clean cut for page/row compression. I'll be very grateful for anyone that can enlighten me.

    Regards

     

  • https://docs.microsoft.com/en-us/previous-versions/sql/sql-server-2008/dd894051(v=sql.100)

    "

    Compressed pages are persisted as compressed on disk and stay compressed when read into memory. Data is decompressed (not the entire page, but only the data values of interest) when it meets one of the following conditions:

    It is read for filtering, sorting, joining, as part of a query response.

    It is updated by an application.

    There is no in-memory, decompressed copy of the compressed page. Decompressing data consumes CPU. However, because compressed data uses fewer data pages, it also saves:

    Physical I/O: Because physical I/O is expensive from a workload perspective, reduced physical I/O often results in a bigger saving than the additional CPU cost to compress and decompress the data. Note that physical I/O is saved both because a smaller volume of data is read from or written to disk, and because more data can remain cached in buffer pool memory.

    Logical I/O (if data is in memory): Because logical I/O consumes CPU, reduced logical I/O can sometimes compensate for the CPU cost to compress and decompress the data.

    "

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

Viewing 2 posts - 1 through 1 (of 1 total)

You must be logged in to reply to this topic. Login to reply