Data Compression Double Take

  • Comments posted to this topic are about the item Data Compression Double Take

  • Have a bunch of MS Dynamics GP databases. The tables typically are substantially de-normalised (very "wide" tables of partially redundant data), have long fixed length character columns, and large numbers of decimal columns containing zeros. The databases and all their contained tables (some 1000 tables in a typical GP database) compress excellently using row compression; we haven't implemented page compression. In the large number of tables it appeared difficult to select appropriate candidates; it seemed easier to row compress all (except tables with a very low number of rows) and uncompress the exceptions if and when we found them. Have found no noticeably poor candidates as yet. In fairness to MS the table design is probably a legacy from the earlier multi-database support

    In terms of CPU usage I am assuming that SELECTs with a high number of logical reads (in the hundereds of thousands of rows) should show high CPU when compressed. The effect does not appear noticeable, and the server in any case currently has horsepower to spare. Any effect is dwarfed by other CPU effects we may have from custom code

    Since we are currently blessed with both sufficient RAM to contain the entire set of databases (nearly zero read IO from the smaller databases), and more recently some really fast write IO, our performance experience is a trifle difficult to compare

    Tony T

  • Just remember that a database encrypted with TDE does not compress all that much.:-D

    "Technology is a weird thing. It brings you great gifts with one hand, and it stabs you in the back with the other. ...:-D"

  • TDE occurs when the pages are written to disk or read from disk. This means that TDE has little to no impact on page or row compression. You are correct about compression at the file level though.

  • Compression to save disk space is of dubious value unless it saves IO.

    With technologies such as Fusion-IO and SSDs where you are talking about literally thousands of times that of mechanical disks IO performance does compression offer still offer benefits?

    Where does the bottleneck shift once mechanical disk IO is mitigated?

  • When disk I/O is no longer a problem, it comes to CPU and network I/O... To reduce the CPU load (and disk I/O) you could e.g. tune your queries / indexes, so expensive sorts occurse not so often. And / or you could let your application make the sorts on client side instead of querying a sorted result (of course this is nothing that could be implemented easy in a long existing application).

    The saved disk space on compressed tables can be a benefit for it alone, most databases have a small percentage of hot data and a big, whole heap of (c)old data. Even if I use partitioning and save the hot data on FusionIO disks, why should I let the cold data take 150 instead of 20 GB space on my hard disks, even if this data will be seldom queried. And do not forget, that a DBCC CHECKDB, a backup, a restore always has to access ALL data in a database and in this case it will save a lot of I/O if the data is much smaller.

    Regarding TDE: since SQL 2016 SP2 you can use backup compression together with TDE (the backup will read the data, decrypt it, compress it, encrypt it and writes it to the backup file). But only, if you set a MaxTransferSize of > 64k. But as mentioned above, TDE does not affect the PAGE / ROW compression of tables / indexes, since TDE kicks in on a deeper level (just before writing / after reading from disk, while the compressions happens when reading / writing to the Buffer Pool)

    God is real, unless declared integer.

  • David.Poole wrote:

    Compression to save disk space is of dubious value unless it saves IO.

    With technologies such as Fusion-IO and SSDs where you are talking about literally thousands of times that of mechanical disks IO performance does compression offer still offer benefits?

    Where does the bottleneck shift once mechanical disk IO is mitigated?

    "Literally thousands of times that of mechanical disks IO performance"?  Do you have some link with a demonstration of that as it pertains to SQL Server?  I've seen a fair number of systems go through a change from spinning rust to SSDs and none of them have even come close to "thousands of times" in the area of IO performance.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • David.Poole wrote:

    Compression to save disk space is of dubious value unless it saves IO.

    With technologies such as Fusion-IO and SSDs where you are talking about literally thousands of times that of mechanical disks IO performance does compression offer still offer benefits?

    Where does the bottleneck shift once mechanical disk IO is mitigated?

    I don't think fusion-io exist  as a company anymore.

  • Jeff Moden wrote:

    David.Poole wrote:

    Compression to save disk space is of dubious value unless it saves IO.

    With technologies such as Fusion-IO and SSDs where you are talking about literally thousands of times that of mechanical disks IO performance does compression offer still offer benefits?

    Where does the bottleneck shift once mechanical disk IO is mitigated?

    "Literally thousands of times that of mechanical disks IO performance"?  Do you have some link with a demonstration of that as it pertains to SQL Server?  I've seen a fair number of systems go through a change from spinning rust to SSDs and none of them have even come close to "thousands of times" in the area of IO performance.

    Yeah, it usually just made their locking and deadlocking problems more "tangible"

    _____________
    Code for TallyGenerator

  • This article was wrongly linked under the topic "Columnstore Indexes are Finally Sorted in SQL Server 2022" in the newsletter from 20th July 2022.

    I didn't find Steve Jones article with this name, but one from Brent Ozar, which should help too, if someone is looking for informations about ordered Columnstore Indexes: https://www.brentozar.com/archive/2022/07/columnstore-indexes-are-finally-sorted-in-sql-server-2022/

    God is real, unless declared integer.

  • This was removed by the editor as SPAM

  • Interesting proposition. I take it that the reference to real data being unavailable refers to the "CPU Cycles" aspect. There are some simple tests that could contribute to the conversation here: https://www.sqlservercentral.com/articles/the-real-world-implementing-data-compression-in-sql-server-2008-r2.

    Happy to see how this pans out with Red Gate's solution.

    Br. Kenneth Igiri
    https://kennethigiri.com
    All nations come to my light, all kings to the brightness of my rising

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply