SQLServerCentral Article

Differential Backups Internals

,

One of the fundamental questions I ask perspective DBA candidates is: "Can you perform a differential backup under simple model?" This is a 50/50 chance of success, but yet the majority of candidates get this wrong. For those that get it right I follow up with "How does SQL make this possible?" Only one candidate has ever successfully answered this question. This article explains how differential backups occur under the covers, with the help of the dbcc page command.

If you're following along with this article, put your Adventureworks database in simple recovery mode, then perform a full backup for a clean start. Select a table and use the undocumented command dbcc ind command.

dbcc ind(adventureworks, [Production.TransactionHistory], 1)

What your looking at here is list of datapages for index 1, or the clustered index in this case. Explaining this command is outside of the scope of this paper but you want to find a row who's PageType = 1. This indicates the page is a data page (root) as compared to an intermediate index, or an IAM page. Find the corresponding PageFID (File ID), PagePid (Page Number) results in this row.

Once you grab these numbers, execute the undocumented command dbcc page. The Syntax is dbcc page (dbname/ID, FileNum, PageNum, 3 optional commands). For the case of this article put a '1' where the optional commands sit. In order to return results sets to Management Studio, turn on trace flag 3604 prior to running the command.

DBCC TRACEON (3604)
DBCC PAGE ('adventureworks', 1, 6608, 1)

Troll the results until you find "Diff" in the text. Notice that next to “Diff” it says "Not Changed". This tells you that this page had not been marked for a differential backup. Now go ahead and update the table, and run the command again.

update Production.TransactionHistory
set Quantity = 100
go
DBCC PAGE ('adventureworks', 1, 6608, 1)

The DIFF result now equals "Changed", obviously meaning the page is marked for a differential backup. Go ahead and run a differential backup and issue the DBCC command again.

Backup database adventureworks to disk = 'c:\testing\ad.diff'
with init, differential
go
DBCC PAGE ('adventureworks', 1, 6608, 1)

The output of this command is expected: the "Changed" value remains,  because only a full backup or a restore can change this. However I did prove that you can do a differential backup in simple mode. Run a full backup to set this back to "Not Changed". If you want to get a little wild, use the "copy_only" option of the backup, and look at the results.

Based on these results you would think that this value is in the data page is a form of a flag bit. This isn't the case at all. If this was the case a differential backup would require a full scan of every page in your database. You would end up doing almost the entire work of a full backup, without the disk writes. SQL Server gets around this by using a bitmask structure inside the sixth page of every data file called the DCM. It takes on a similar structure of the GAM and SGAM, by holding pointers to 64,000 extents or 4GB of data. As with the GAM every 4th GB a new DCM appears for tracking the following 4GB.

When a change to a data page occurs the DCM is updated for the changed extent in which the page lives, to indicate the change. When a differential backup is kicked off, rather than scanning all the pages in the database, it simply scans the DCM, which points to the needed data. The backup process then only grabs the changed extents, speeding up the process.

So if you happen to be interviewing at Zillow, and your asked if differential backups can occur in simple mode, the answer is yes.

Rate

4.56 (18)

You rated this post out of 5. Change rating

Share

Share

Rate

4.56 (18)

You rated this post out of 5. Change rating