Now, why in the world would I be asking a question like this?
Well, in BOL (BACKUP) is this little blurb:
Specifies that the backup operation will verify each page for checksum and torn page, if enabled and available, and generate a checksum for the entire backup. This is the default behavior for a compressed backup.
Using backup checksums may affect workload and backup throughput.
- (Emphasis mine)
However, when performing a compressed backup and examining the results in msdb.dbo.backupset, the has_backup_checksums column is not set to 1. Let’s run through a quick example to see what I’m talking about.
First, let’s create a database and a table, and populate it with a million rows:
USE master; GO -- drop the database if it already exists IF DB_ID('BackupTest') IS NOT NULL DROP DATABASE BackupTest; GO -- make a database. make a table. populate the table. CREATE DATABASE BackupTest; GO USE BackupTest; GO CREATE TABLE dbo.Tally ( N INTEGER CONSTRAINT PK_Tally PRIMARY KEY CLUSTERED ); WITH Tens (N) AS (SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1), Hundreds(N) AS (SELECT 1 FROM Tens t1, Tens t2), Millions(N) AS (SELECT 1 FROM Hundreds t1, Hundreds t2, Hundreds t3), Tally (N) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM Millions) INSERT INTO dbo.Tally (N) SELECT N FROM Tally; GO
Now, let’s perform various full database backups on this database.
USE master; GO -- ensure that the backup directory exists EXECUTE xp_create_subdir 'C:\BackupTest'; -- plain full backup, no options BACKUP DATABASE BackupTest TO DISK = 'C:\BackupTest\BackupTest1_FULL_NoOptions.bak'; GO -- full backup with checksums BACKUP DATABASE BackupTest TO DISK = 'C:\BackupTest\BackupTest2_FULL_CHECKSUM.bak' WITH CHECKSUM; GO -- compressed backup BACKUP DATABASE BackupTest TO DISK = 'C:\BackupTest\BackupTest3_FULL_COMPRESSION.bak' WITH COMPRESSION; GO -- compressed backup, no checksums BACKUP DATABASE BackupTest TO DISK = 'C:\BackupTest\BackupTest4_FULL_COMPRESSION_NOCHECKSUM.bak' WITH COMPRESSION, NO_CHECKSUM; GO -- compressed backup, checksums BACKUP DATABASE BackupTest TO DISK = 'C:\BackupTest\BackupTest5_FULL_COMPRESSION_CHECKSUM.bak' WITH COMPRESSION, CHECKSUM; GO
Now let’s looks at the results of these backups as stored in msdb:
-- show the backups that were performed along with relevant information SELECT [DATABASE name] = database_name, [backup start date] = backup_start_date, [has backup checksums] = has_backup_checksums, [backup size] = backup_size, [compressed backup size] = compressed_backup_size FROM msdb.dbo.backupset WHERE database_name = 'BackupTest' AND backup_start_date > DATEADD(DAY, DATEDIFF(DAY, '2000-01-01', GETDATE()), '2000-01-01') ;
For the backups I just performed, this shows (I’ve manually added the last column):
|database name||backup start date||has backup checksums||backup size||compressed backup size||options used|
|BackupTest||2013-12-18 14:34:50.000||0||17015808||2667990||COMPRESSION, NO_CHECKSUM|
|BackupTest||2013-12-18 14:34:50.000||1||17015808||2518869||COMPRESSION, CHECKSUM|
Let’s look at what the actual backup files show…
-- show the backup file headers RESTORE HEADERONLY FROM DISK = 'C:\BackupTest\BackupTest1_FULL_NoOptions.bak'; RESTORE HEADERONLY FROM DISK = 'C:\BackupTest\BackupTest2_FULL_CHECKSUM.bak'; RESTORE HEADERONLY FROM DISK = 'C:\BackupTest\BackupTest3_FULL_COMPRESSION.bak'; RESTORE HEADERONLY FROM DISK = 'C:\BackupTest\BackupTest4_FULL_COMPRESSION_NOCHECKSUM.bak'; RESTORE HEADERONLY FROM DISK = 'C:\BackupTest\BackupTest5_FULL_COMPRESSION_CHECKSUM.bak';
Which returns the same results as the query against the msdb.dbo.backupset table.
So, where does this leave us? Are checksums being performed with a compressed backup? By looking at only the has_backup_checksums column, then it appears the answer is no. Even when you also look at the compressed backup size, any compressed backup without the checksum option gives file sizes of the same relative size (I show one byte difference). So, we appear to have a discrepancy between the documented behavior in BOL, and what is actually being observed.
So the real question becomes: Does performing a compressed backup provide the same usefulness as when specifying the checksum option? As it turns out, after I started writing this blog post (and before I finished it), Brent Ozar had the same question. Doing something that I hadn’t considered, he pulled out his hex editor and… well, read about it here. The quick summary is: if you don’t specify the checksum option, then page checksums / torn pages aren’t being checked for. Which means that BOL is wrong. (Thanks Brent!)
It is my opinion that when the compressed backup is occurring, it calculates an internal checksum for the data being backed up (a calculation based upon the data being backed up, NOT verifying the page checksum values previously calculated or checking for torn pages like the checksum option for the backup command does). I think that many years ago, back when backup compression was being added to SQL 2008, the BOL writer saw “checksum is calculated during a compressed backup”, and wrongfully wrote up that the checksum option is the default for a compressed backup. Again, this is just my opinion, and I have no facts to back this up.
Since it is a good practice to utilize the checksum option when doing backups (to ensure that there isn’t any corruption from the IO subsystem), you should know that you cannot perform this from a maintenance plan – that option isn’t there. I was really hoping that the compressed backups did do the checksums, as it would be easy to implement at all of the clients that still use maintenance plan (since making a compressed backup is an option in the maintenance plan). So you will need to use a solution based upon executing the T-SQL statement to use the checksum option (such as Ola Hallengren’s free solution). This is just another reason why using a script-based solution is better than using a maintenance plan.
If you made it this far, I hope that you will visit this connect item (Backup COMPRESSION enables CHECKSUM?) to vote on getting this fixed.