SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 

What’s a differential backup?

Of the different basic types of backups (full, differential and log) I find the differential the most interesting, and frequently the least understood. Full backups are easy. The whole database including any log information needed to make the committed transactions durable. Transaction logs aren’t much more complicated. They contain information on any transactions committed since the last transaction log. (yes, I’m simplifying on both. sue me)

What makes differentials different? They contain any extents (8 pages) that have changed since the last full backup. Sounds simple right? And it is. Ok, so if it’s simple why would it be misunderstood? Because it backs up changed extents. Not transactions. If one small piece of one row in a given extent has been changed then that whole extent will be included in the differential backup. This might seem a bit wasteful. I mean a changing a single bit means the entire extent of 64kb gets written to the backup. On the other hand, if that same small piece of information is changed 100, 1000, 10000 times that extent will still only be included in the differential backup once. (Interestingly data can and will be included multiple times in the log backup, once for each transaction where the data is changed.)

So in order to take a differential SQL has to know each extent that’s been changed. It would seem like that could be a fair amount of work if the database is of any reasonable size. In fact, SQL is pretty efficient. It keeps what’s called a DCM (differential change map) that has a flag for each extent within a fairly large amount of space. (as I understand it just shy of 4gb) By checking that one page SQL knows which extents out of that 4gb of data needs to be written to the differential. I honestly find this incredibly cool. One single page tells SQL what extents over 4gb of data need to be backed up.

The other cool bit, and I mean really really cool, is the fact that essentially the differential is an overlay of the data. Let me explain. A differential has a starting and ending LSN (log sequence number). If the current database state is anywhere between those two LSN then applying the differential will immediately bring it up to that final LSN. It makes absolutely no difference where the database is along the chain between the beginning LSN and ending, the differential brings it up to current (for the differential).

As a somewhat concrete example let’s say you have a database with a single table that looks like this (yea, I know I’m simplifying again) at 1pm on Friday.

Employees

Name Address
Kenneth Fisher 123 Smith Street

Now, you get updates

  • 1pm Friday: Full backup taken
  • 2pm Friday: Address changed to 124 Smith Street
    Transaction Log Backup 1
  • 4pm Friday: First name changed to Bob
    Transaction Log Backup 2
  • 10pm Friday: Address changed to 123 Smith Street
    Transaction Log Backup 3
  • 2am Saturday: Last name changed to Smith
    Transaction Log Backup 4
  • 10am Saturday: Address changed to 123 Fisher Street
    Transaction Log Backup 5
  • 3pm Saturday: Differential taken

 
Once we’ve restored the FULL you can either apply a bunch of transaction log backups, and each is going to basically run every transaction to that point against the database, or you can apply the differential. Or, in fact you can apply some of the transaction logs and then the differential. As long as the database isn’t marked recovered, it’s going to just overwrite whatever is there with the final value.

Employees

Name Address
Bob Smith 123 Fisher Street

Completely skipping any intervening transactions or values.

And the last couple of things I want to point out:

  1. A differential file could get as large as the full backup but no larger, although it probably won’t.
  2. Differential backups are available on all recovery types. Full, Bulk and Simple.

SQLStudies

My name is Kenneth Fisher and I am Senior DBA for a large (multi-national) insurance company. I have been working with databases for over 20 years starting with Clarion and Foxpro. I’ve been working with SQL Server for 12 years but have only really started “studying” the subject for the last 3. I don’t have any real "specialities" but I enjoy trouble shooting and teaching. Thus far I’ve earned by MCITP Database Administrator 2008, MCTS Database Administrator 2005, and MCTS Database Developer 2008. I’m currently studying for my MCITP Database Developer 2008 and should start in on the 2012 exams next year. My blog is at www.sqlstudies.com.

Comments

Leave a comment on the original post [sqlstudies.com, opens in a new window]

Loading comments...