• UMG Developer (7/26/2011)


    Tom.Thomson (7/26/2011)


    Good article.

    On the compressed backup of a compressed table question: no, the compression algorithms/formats are different, so backing up a compressed table/index involves decompressing and then recompressing; so the CPU impact is higher than you might expect.

    Do you know for certain that a backup uncompresses and then recompresses compressed tables? I thought it just backed up the data pages as they were on the disk, and applied compression to them. (So there shouldn't be any uncompression CPU overhead during a backup.)

    Good question. No, I didn't know it for certain. I read it somewhere, but I can't remember whether it was somewhere I should trust; someone said it and I believed it. :blush: Mistake! I've added a correction to the post you quoted.

    Your question prompted me to do some experiments. As a result I'm now actually certain that backup does not decompress (and if I ever remember where I picked up that misinformation, I'll add the source to my distrusted list). So we have another example of the inaccuracy of the statement in BoL that any edition can restore a compressed backup: if the database uses page or row compression, a Standard edition restore will fail in the recovery stage (just as it does if the database uses data encryption or any other Enterprise-only feature).
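
    (For anyone who wants to check in advance whether a database will hit that restriction, the sys.dm_db_persisted_sku_features DMV lists the Enterprise-only features persisted in the current database; data compression shows up there once any table or index is compressed. A minimal check, run here against playpen just as an example:)

    use playpen
    go
    -- Any rows returned here (e.g. Compression, TransparentDataEncryption) mean
    -- this database cannot be restored or attached on Standard edition.
    select feature_name, feature_id
    from sys.dm_db_persisted_sku_features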

    I built several databases specifically to test this, creating both data that would be susceptible to row and page compression and data that would not but would still be susceptible to more general compression. The results make it pretty clear that backup does no decompression.
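
    (As an aside, if you want a rough idea of how well row or page compression will do on a particular table before rebuilding it, sp_estimate_data_compression_savings, available in the Enterprise and Developer editions from SQL Server 2008 onwards, will give an estimate; the call below, against the comptests table the script further down creates, is just an illustration.)

    use playpen
    go
    -- Estimate the effect of PAGE compression on dbo.comptests without rebuilding it.
    exec sp_estimate_data_compression_savings
         @schema_name = 'dbo',
         @object_name = 'comptests',
         @index_id = NULL,
         @partition_number = NULL,
         @data_compression = 'PAGE'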

    Here's a test script. It uses a database called playpen with the simple recovery model (so that, by issuing a checkpoint, I can ensure none of the backups is swamped by log data), containing an 11001-entry Tally table called Tally; I haven't included the steps to create those, as they're irrelevant to the point being tested. The script also assumes access to a directory c:\playpen to store the test backups in. It creates and populates a table, takes four different backups, and displays information on the resulting backup sets (assuming no other backups happen while the test is running), which makes it very clear that backup does not decompress.
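
    (If you want to run it yourself, something along these lines will set up the prerequisites; this is just a sketch, not the setup I actually used, and the exact contents of the Tally table don't matter for the test.)

    -- Sketch of the prerequisites: a simple-recovery database called playpen
    -- with an 11001-row Tally table (numbered 1 to 11001 here).
    create database playpen
    alter database playpen set recovery simple
    go
    use playpen
    go
    select top (11001)
           cast(row_number() over (order by (select null)) as int) as N
    into Tally
    from master.sys.all_columns T1 cross join master.sys.all_columns T2
    alter table Tally alter column N int not null
    alter table Tally add constraint PK_Tally primary key clustered (N)
    go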

    use playpen
    go
    -- Test table: narrow int/bigint key columns plus varchar columns built from GUIDs.
    create table comptests (
        A bigint not null identity(144115188075855872,-3) primary key clustered,
        B varchar(108) not null default (''),
        C varchar(108) not null default (''),
        D varchar(108) not null default (''),
        E varchar(108) not null default (''),
        F varchar(108) not null default (''),
        G varchar(216) not null default(''),
        H varchar(216) not null default(''),
        I int not null default(0),
        J int not null default(0),
        K int not null default(0),
        L bigint not null default(0))
    -- Helper number sequences used to populate the test table.
    select top 4000 identity(bigint, 284135188075895873,-113) as X
    into #t
    from master.sys.all_columns T1, master.sys.all_columns T2
    select top 4000 IDENTITY(int,410002,37) as Y
    into #u
    from master.sys.all_columns T1, master.sys.all_columns T2
    select top 4000 IDENTITY(int, 2000000000, -97) as Z
    into #v
    from master.sys.all_columns T1, master.sys.all_columns T2
    create table #w1 (Y int, II int identity(1,1) primary key)
    insert #w1(Y) select Y from #u order by Y desc
    create table #w2 (Z int, IJ int identity(1,1) primary key)
    insert #w2(Z) select Z from #v order by Z asc
    create table #w3 (X bigint not null, IH int identity(1,1) primary key)
    insert #w3(X) select X from #t order by X desc
    declare @x varchar(108) = cast(newid()as char(36))+cast(newid()as char(36))+
                              cast(newid()as char(36))
    declare @y varchar(108) = cast(newid()as char(36))+cast(newid()as char(36))+
                              cast(newid()as char(36))
    drop table #t, #u, #v
    -- Populate the test table.
    ;with WW as (select X, Y, Z
                 from #w1, #w2, #w3
                 where II=IH and IJ=IH)
    insert comptests(L,K,J,I,H,G,F,E) select X,Y,Z,Z%Y,
        case(X%2) when 0 then @y else @x end,
        case(X%2) when 1 then @x else @y end,
        case(X%2) when 0 then @x else @y end,
        case(X%2) when 1 then @y else @x end
    from WW order by X desc
    drop table #w1,#w2,#w3
    update comptests set
        B = SUBSTRING(E,1,54)+SUBSTRING(F,1,54),
        C = SUBSTRING(E,1,54)+SUBSTRING(G,1,54),
        D = SUBSTRING(E,1,54)+SUBSTRING(H,1,54)
    select top 10 * from comptests
    -- Backups with the data uncompressed, without and with backup compression.
    checkpoint
    checkpoint
    backup database playpen to DISK='c:\playpen\PP_NONE_NO.bak'
        with no_compression, init
    checkpoint
    backup database playpen to DISK='c:\playpen\PP_NONE_YES.bak'
        with compression, init
    -- Apply page compression, then repeat the same two backups.
    alter table comptests REBUILD WITH (DATA_COMPRESSION = PAGE)
    alter table Tally REBUILD WITH (DATA_COMPRESSION = PAGE)
    checkpoint
    checkpoint
    backup database playpen to DISK='c:\playpen\PP_PAGE_NO.bak'
        with no_compression, init
    checkpoint
    backup database playpen to DISK='c:\playpen\PP_PAGE_YES.bak'
        with compression, init
    checkpoint
    go
    -- Inspect the sizes of the four backups just taken.
    use msdb
    go
    SELECT TOP 4 [backup_set_id]
        ,[backup_size]
        ,[compressed_backup_size]
        ,[media_set_id]
        ,[first_lsn]
        ,[last_lsn]
        ,[checkpoint_lsn]
        ,[database_backup_lsn]
        ,[backup_finish_date]
        ,[recovery_model]
    FROM [msdb].[dbo].[backupset]
    order by [backup_set_id] desc
    go
    -- Tidy up.
    use playpen
    go
    drop table comptests
    alter table tally REBUILD WITH (DATA_COMPRESSION = NONE)
    checkpoint
    go

    The table is not very big (a few MB uncompressed) - the script populates it on my aging laptop in essentially no elapsed time (the backups and the compression rebuilds take a bit longer).
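
    To make the numbers easier to compare, a follow-up query along these lines converts the backupset figures to MB and a percentage saved (for the backups taken with no_compression, compressed_backup_size simply equals backup_size):

    use msdb
    go
    -- Size on media vs logical size for the four most recent backups.
    select top 4
           backup_set_id,
           backup_size / 1048576.0            as backup_mb,
           compressed_backup_size / 1048576.0 as on_media_mb,
           100.0 * (1.0 - cast(compressed_backup_size as float)
                          / cast(backup_size as float)) as pct_saved
    from msdb.dbo.backupset
    order by backup_set_id desc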

    Tom