SQL Server 2008 Compression

  • This is a great article. Highly informative and is helpful in a decision I'm looking to make currently about compression.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw
    Learn Extended Events

  • sknox (4/22/2010)


    David Walker-278941 (4/22/2010)


    This is confusing: "As you can see from the disk bytes/sec counter, there is a considerable savings caused by both row and page level compression, with the page level compression yielding the highest improvement."

    How is this? The number of bytes per second is much lower with compression on. What is the "considerable savings"? A lower number of bytes per second is a bad thing, not a good thing. Can you explain this please?

    We're talking SQL Server performance, not disk performance. This is not how many bytes per second the disk array is capable of; this is how many bytes SQL Server had to read/write to disk during the testing -- and the less disk access needed, the better.

    This is also why (I believe) the compression results in faster backup and access times -- the bottleneck in this particular test is the disk array.

    Ah, yes, that does make sense. Still, it would be useful to know what the "units" are on the Time scale.

    I think that with a lot of SQL Server work, the disk subsystem is the bottleneck... not just in these tests.
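
    (A side note on measuring this: if you would rather see the I/O SQL Server itself issued during a test run, instead of the raw PerfMon counters, a DMV query along these lines works on 2005 and up. It is only a sketch, and the numbers are cumulative since the instance started, so take a snapshot before and after the test and diff them.)

    -- Sketch: file-level I/O as seen by SQL Server (cumulative since instance startup).
    select DB_NAME(vfs.database_id)  as database_name,
           mf.physical_name,
           vfs.num_of_reads,
           vfs.num_of_bytes_read,
           vfs.num_of_writes,
           vfs.num_of_bytes_written,
           vfs.io_stall              as total_io_stall_ms
    from sys.dm_io_virtual_file_stats(null, null) as vfs
    join sys.master_files as mf
        on  mf.database_id = vfs.database_id
        and mf.file_id     = vfs.file_id
    order by vfs.num_of_bytes_read + vfs.num_of_bytes_written desc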

  • David Walker-278941 (4/22/2010)


    sknox (4/22/2010)


    Ah, yes, that does make sense. Still, it would be useful to know what the "units" are on the Time scale.

    I think that with a lot of SQL Server work, the disk subsystem is the bottleneck... not just in these tests.

    Sorry - just noticed the time question. The tests I did were just short 3-minute tests, and those "units" are just whatever MS Excel decided I wanted; I didn't feel like fighting it. It looks like they are roughly 15-second intervals.

    Before you implement anything, I'd suggest playing on a dev box with a copy of the database you're considering compressing, using a more realistic load, and running the tests for a lot longer.

    We are just getting ready to run live with our first 2008 box (all 2005 otherwise) with a brand new app, so for now I'm just doing backup compression and not DB compression. From what I can tell it's just as easy to implement compression later as it is initially (keeping in mind a short burst of CPU/Disk use will probably be involved).
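
    In case it's useful, the sort of thing I mean is roughly this (MyDatabase, dbo.MyBigTable, and the paths are just placeholders, not objects from the article's tests):

    -- Backup compression now (Enterprise in 2008; Standard gets it in 2008 R2).
    backup database MyDatabase
        to disk = 'X:\Backups\MyDatabase.bak'
        with compression, init;

    -- Row or page compression can be added later with a rebuild;
    -- expect a short burst of CPU and disk while it runs.
    alter table dbo.MyBigTable rebuild with (data_compression = row);
    alter index all on dbo.MyBigTable rebuild with (data_compression = page);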

    The Redneck DBA

  • David Walker-278941 (4/22/2010)


    sknox (4/22/2010)


    This is also why (I believe) the compression results in faster backup and access times -- the bottleneck in this particular test is the disk array.

    I think that with a lot of SQL Server work, the disk subsystem is the bottleneck... not just in these tests.

    Certainly, but as that's not always the case, and especially as encryption and compression have historically been compute-intensive tasks, I felt it was important to note that fact. As the disk subsystem is the bottleneck here, compression helps performance by decreasing reliance on that subsystem. But if you know that your disk subsystem has better performance characteristics -- or that your CPU has worse -- you should expect less dramatic results -- possibly even decreased performance in extreme cases.

  • Good article and glad to see numbers attached to it.

    One question, though: when you apply row or page compression, is the table (and its respective indexes) still accessible, or does the operation hold some sort of lock on it? The reason for the question is that my production OLTP database is in a 24x7 environment and we have no maintenance window.

    Mike Byrd

  • Mike Byrd (4/23/2010)


    Good article and glad to see numbers attached to it.

    One question, though: when you apply row or page compression, is the table (and its respective indexes) still accessible, or does the operation hold some sort of lock on it? The reason for the question is that my production OLTP database is in a 24x7 environment and we have no maintenance window.

    Good question...I hadn't even thought of that! I found this page that discusses it a little bit. Looks like you can do it online or offline just like rebuilding an index.

    http://msdn.microsoft.com/en-us/library/dd894051.aspx

    It also talks about some other side effects that I hadn't thought about. For example, it looks like rebuilding compressed indexes takes longer than rebuilding uncompressed ones.
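
    Something like the following is what that page describes - an online rebuild with compression (the table and index names here are made up, and ONLINE = ON requires Enterprise edition):

    -- Apply page compression without taking the table offline (Enterprise only).
    alter table dbo.Orders
        rebuild with (data_compression = page, online = on);

    -- The same option is available per index.
    alter index IX_Orders_CustomerID on dbo.Orders
        rebuild with (data_compression = page, online = on);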

    The Redneck DBA

    * To cover all the bases, we need to add performance data for compression + encryption to this article. It needs to be more generic in terms of CPU and I/O cycles, so that performance data can be extrapolated for a given hardware set without having to run the actual test.

    ** It needs to include data on the performance of these features with multiple RAID arrays attached using:

    a. SCSI

    b. iSCSI

    c. and (e)SATA

    In other words, we need to understand saturation patterns with limited resources, rather than encouraging signs of how "well" the feature fares under favorable circumstances.

    ** It would have been nice to have compression in SQL Server 2005 SP5.

    *** It would also be extremely nice to have backup automation + integration (with compression to reduce I/O). Just as we were introduced to the automatic trace with SQL Server 2005, this feature could be integrated (2008 R2 SP3?) for continuous backup and near-instantaneous recovery. Microsoft's DPM product could showcase this technology and make it a "must have" for 3rd-party vendors.

    *** Compression of trace data and broker messages?

  • Miron Berlin (4/24/2010)


    * To cover all the bases, we need to add performance data for compression + encryption to this article. It needs to be more generic in terms of CPU and I/O cycles, so that performance data can be extrapolated for a given hardware set without having to run the actual test.

    ** It needs to include data on the performance of these features with multiple RAID arrays attached using:

    a. SCSI

    b. iSCSI

    c. and (e)SATA

    In other words, we need to understand saturation patterns with limited resources, rather than encouraging signs of how "well" the feature fares under favorable circumstances.

    Yes, you would certainly want to do your own testing in your own environment, rather than relying on the results presented here, before making any decisions. I also wonder what would happen to the results if there were a SAN in the mix. I'm not sure how encryption got slipped in...I wasn't including that in the scope of this article at all (at least not intentionally).

    ** It would have been nice to have compression in SQL Server 2005 SP5.

    SP5?
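
    On the do-your-own-testing point: before running a full load test, sp_estimate_data_compression_savings will give a rough idea of what compression buys you for a given table. Just a sketch - swap in your own schema, table, and compression type:

    -- Estimate space savings for page compression across all indexes of one table.
    -- Parameters: schema, table, index_id (null = all), partition (null = all), compression type.
    exec sp_estimate_data_compression_savings
        @schema_name      = 'dbo',
        @object_name      = 'MyBigTable',
        @index_id         = null,
        @partition_number = null,
        @data_compression = 'page';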

    The Redneck DBA

    Can anyone confirm whether Backup Compression is included in the new SQL 2008 R2 Standard, or is it still just for Enterprise? This really should be offered in Standard.

  • JohnSQLServerCentral (5/6/2010)


    Can anyone confirm whether Backup Compression is included in the new SQL 2008 R2 Standard, or is it still just for Enterprise? This really should be offered in Standard.

    The Microsoft literature says that Standard edition now gets backup compression. From this link: http://www.microsoft.com/sqlserver/2008/en/us/R2-editions.aspx

    "The following capabilities are new to Standard:

    Backup Compression to reduce data backups by up to 60% and help reduce time spent on backups *

    Can be managed instance for Application and Multi-Server Management capabilities

    High-scale complex event processing with SQL Server StreamInsight™"
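
    If you do have an edition that supports it, backup compression can also be made the server-wide default, so individual jobs don't need WITH COMPRESSION spelled out - roughly this (MyDatabase is a placeholder):

    -- Make compressed backups the instance default (individual backups can still override it).
    exec sp_configure 'backup compression default', 1;
    reconfigure;

    -- A single backup can still opt out:
    -- backup database MyDatabase to disk = 'X:\Backups\MyDatabase.bak' with no_compression;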

  • Jason, thanks for this great article, I can't wait to get our production server upgraded so we can actually start using some of these features!

  • Jason Shadonix (4/22/2010)


    dma333 (4/22/2010)


    Great article. ... Also, when backing up a compressed table to a compressed backup, is that faster than an uncompressed table to a compressed backup (since it's already compressed)?

    ....

    Good question on the compressed backups of compressed databases. I hadn't thought of that. Since my test database contained a combination of compressed and uncompressed tables, not sure of the answer to your question. Should be fairly easy to test though.

    Good article.

    Edit: bits struck through below because I've now done some tests instead of believing what I read and what I was told, which turned out to be incorrect when I tested it.

    On the compressed backup of a compressed table question: no, the compression algorithms/formats are different, so backing up a compressed table/index involves decompressing and then recompressing; so the CPU impact is higher than you might expect. But yes, you do save disc activity on both the source and the destination, and since the backup is usually disc bound this should bring the elapsed time down and also mitigate the CPU hit because there is less IO to handle.

    I haven't done any testing myself, so I can't be completely certain, and of course the outcome will depend on configuration, but people have told me that compressed backups of compressed data and indices give even more elapsed-time saving than compressed backups of non-compressed tables/indices, for not much more CPU cost. But it all depends on your individual system - for example, if you are using SSDs for both the DB and the backups, the elapsed-time saving from compression will be far smaller than if you are using slow and clunky cheap discs; but then the cost reduction from needing less space might be more important, as SSDs are still not as cheap as HDDs.

    Tom

    I was an early user of SQL LiteSpeed. Yes, you are correct: the reduced I/O from compression resulted in tremendously shorter backup and restore times. Depending on environment and hardware, we generally saw backup times of about 1/3 of uncompressed backups, and restore times of about 1/2 of uncompressed backups. I saw absolutely no downside to compressed backups.

    Mike Byrd

  • Tom.Thomson (7/26/2011)


    Good article.

    On the compressed backup of a compressed table question: no, the compression algorithms/formats are different, so backing up a compressed table/index involves decompressing and then recompressing; so the CPU impact is higher than you might expect.

    Do you know for certain that a backup uncompresses and then recompresses compressed tables? I thought it just backed up the data pages as they were on the disk, and applied compression to them. (So there shouldn't be any uncompression CPU overhead during a backup.)

  • UMG Developer (7/26/2011)


    Tom.Thomson (7/26/2011)


    Good article.

    On the compressed backup of a compressed table question: no, the compression algorithms/formats are different, so backing up a compressed table/index involves decompressing and then recompressing; so the CPU impact is higher than you might expect.

    Do you know for certain that a backup uncompresses and then recompresses compressed tables? I thought it just backed up the data pages as they were on the disk, and applied compression to them. (So there shouldn't be any uncompression CPU overhead during a backup.)

    Good question. No, I didn't know for certain. I read it somewhere, but I didn't remember whether that was somewhere I should trust. Someone told me it, and I believed it. :blush: Mistake! I've added a correction to the post you quoted.

    Your question prompted me to do some experiments. As a result, I'm now actually certain that backup does not decompress (and if I ever remember where I picked up that misinformation, I will add the place to my distrusted-sources list). So we have another example of the inaccuracy of the statement in BoL that any edition can restore a compressed backup: if the database used page or row compression, a Standard edition restore will fail in the recovery stage (just as it does if the database uses data encryption or any other Enterprise-only feature).

    I built several databases specifically to test this issue, creating both data that would be susceptible to row and page compression, and data that would not be but would be susceptible to more general compression. The results make it pretty clear that backup does no decompression.

    Here's a test script. (It uses a database called playpen with the simple recovery model [so that, using checkpoint, I can ensure none of the backups is swamped by log data], containing an 11,001-entry Tally table called Tally; I haven't included steps to create that, as it's irrelevant. The script also assumes access to a directory c:\playpen to store the test backups in.) It creates and populates a table, does four different backups, and displays information on the resulting backup sets (assuming no other backups happen while the test is running), which makes it very clear that backup does not decompress.

    use playpen
    go

    -- Test table: a mix of bigint/int and varchar columns, filled below.
    create table comptests (
        A bigint not null identity(144115188075855872,-3) primary key clustered,
        B varchar(108) not null default (''),
        C varchar(108) not null default (''),
        D varchar(108) not null default (''),
        E varchar(108) not null default (''),
        F varchar(108) not null default (''),
        G varchar(216) not null default(''),
        H varchar(216) not null default(''),
        I int not null default(0),
        J int not null default(0),
        K int not null default(0),
        L bigint not null default(0))

    -- Generate 4000 rows of identity-driven test values in temp staging tables.
    select top 4000 identity(bigint, 284135188075895873,-113) as X
    into #t
    from master.sys.all_columns T1, master.sys.all_columns T2

    select top 4000 IDENTITY(int,410002,37) as Y
    into #u
    from master.sys.all_columns T1, master.sys.all_columns T2

    select top 4000 IDENTITY(int, 2000000000, -97) as Z
    into #v
    from master.sys.all_columns T1, master.sys.all_columns T2

    create table #w1 (Y int, II int identity(1,1) primary key)
    insert #w1(Y) select Y from #u order by Y desc

    create table #w2 (Z int, IJ int identity(1,1) primary key)
    insert #w2(Z) select Z from #v order by Z asc

    create table #w3 (X bigint not null, IH int identity(1,1) primary key)
    insert #w3(X) select X from #t order by X desc

    declare @x varchar(108) = cast(newid()as char(36))+cast(newid()as char(36))+
                              cast(newid()as char(36))
    declare @y varchar(108) = cast(newid()as char(36))+cast(newid()as char(36))+
                              cast(newid()as char(36))

    drop table #t, #u, #v

    -- Populate comptests by joining the staging tables on their identity positions.
    ;with WW as (select X, Y, Z
                 from #w1, #w2, #w3
                 where II=IH and IJ=IH)
    insert comptests(L,K,J,I,H,G,F,E) select X,Y,Z,Z%Y,
        case(X%2) when 0 then @y else @x end,
        case(X%2) when 1 then @x else @y end,
        case(X%2) when 0 then @x else @y end,
        case(X%2) when 1 then @y else @x end
    from WW order by X desc

    drop table #w1,#w2,#w3

    update comptests set
        B = SUBSTRING(E,1,54)+SUBSTRING(F,1,54),
        C = SUBSTRING(E,1,54)+SUBSTRING(G,1,54),
        D = SUBSTRING(E,1,54)+SUBSTRING(H,1,54)

    select top 10 * from comptests

    -- Two backups of the uncompressed data: one plain, one with backup compression.
    checkpoint
    checkpoint
    backup database playpen to DISK='c:\playpen\PP_NONE_NO.bak'
        with no_compression, init
    checkpoint
    backup database playpen to DISK='c:\playpen\PP_NONE_YES.bak'
        with compression, init

    -- Page-compress the data, then repeat both backups.
    alter table comptests REBUILD WITH (DATA_COMPRESSION = PAGE)
    alter table Tally REBUILD WITH (DATA_COMPRESSION = PAGE)
    checkpoint
    checkpoint
    backup database playpen to DISK='c:\playpen\PP_PAGE_NO.bak'
        with no_compression, init
    checkpoint
    backup database playpen to DISK='c:\playpen\PP_PAGE_YES.bak'
        with compression, init
    checkpoint
    go

    -- Compare backup_size and compressed_backup_size for the four most recent backup sets.
    use msdb
    go
    SELECT TOP 4 [backup_set_id]
        ,[backup_size]
        ,[compressed_backup_size]
        ,[media_set_id]
        ,[first_lsn]
        ,[last_lsn]
        ,[checkpoint_lsn]
        ,[database_backup_lsn]
        ,[backup_finish_date]
        ,[recovery_model]
    FROM [msdb].[dbo].[backupset]
    order by [backup_set_id] desc
    go

    -- Clean up.
    use playpen
    go
    drop table comptests
    alter table tally REBUILD WITH (DATA_COMPRESSION = NONE)
    checkpoint
    go

    The table is not very big (a few MB uncompressed); the script populates it on my aging laptop in next to no elapsed time (the backups and compression take a bit longer).

    Tom
