Mirrored Backups

,

Today I'd like to go over a new and straight forward feature added to SQL

2005 called a mirrored backup. This gives you the ability to backup to two or

more locations simultaneously (which is not the same thing as a striped backup,

that's a different topic!).

Why is this in interesting feature? I think most of us would agree that

having a good backup is job one for a DBA. But the job doesn't end with the

backup, there is still room for things to go wrong. Backup hard drives fail,

tape drives eat the tape, network admins forget to change the tape, etc. I think

it's fairly common now for DBA's to adopt a disk to disk to tape strategy,

meaning they backup the databases to a hard drive they have access to, then have

that backed up to tape. This provides the ability to easily restore from

yesterday without waiting on someone to find the tape, guards against the

possibility that the tape backup didn't work, and gives you a measure of

redundancy (limited because you'll have only a couple days on disk, tapes should

go back for months). This can still leave open the possibility of losing

everything if the facility is destroyed. Having someone take the tape home each

night or using an off site service is always a good idea, but there is a human

element involved in both of those, and we are prone to fail from time to time.

The answer to the last possibility has been to copy or FTP the backup files

to a different location - down the hall, across the campus, across the country

even. Setting it up isn't terribly difficult, but I bet that the number of

servers where something like this is being done is a small percentage in large

part because we tend to rely on maintenance plans until we need a more complex

strategy. So now we have an extension to the backup syntax that lets us do

something similar to our old copy/FTP process, but it's not baked into the

maintenance plans, so I suspect it won't see a lot of user either.

Let's look at how it works. I started with a standard full backup of

Adventureworks on one of the machines in our classroom.

backup database adventureworks to disk='c:\advtest1.bak'
Processed 2 pages for database 'adventureworks', file 'AdventureWorks_Log' on file 1.
BACKUP DATABASE successfully processed 20938 pages in 8.294 seconds (20.680 MB/sec).

To get an idea of how much the network would add to the process, I did the

same backup to another machine in the classroom:

backup database adventureworks to disk='\\train4\backup\advtest1.bak'
Processed 20936 pages for database 'adventureworks', file 'AdventureWorks_Data' on file 1.
Processed 2 pages for database 'adventureworks', file 'AdventureWorks_Log' on file 1.
BACKUP DATABASE successfully processed 20938 pages in 21.017 seconds (8.161 MB/sec).

The time is probably off just a little due to the log backup at the end, but

clearly we can see that it's taking almost twice as long. I received a similar

time when I just copied the file across via xp_cmdshell and the copy command,

but it might be possible to do it faster with xcopy or robocopy, and I didn't

test larger files where any differences might have been more visible. Now let's

try a mirrored backup:

backup database adventureworks to disk='c:\advtest3.bak'
mirror to disk='\\train4\backup\advtest3.bak' with format, description='test'

The syntax is easy enough. Gotchas? The syntax requires the 'with format'

clause to work, and you can only mirror to devices of the same type (all disk or

all tape). Now let's look at the results:

Processed 20936 pages for database 'adventureworks', file 'AdventureWorks_Data' on file 1.
Processed 2 pages for database 'adventureworks', file 'AdventureWorks_Log' on file 1.
BACKUP DATABASE successfully processed 20938 pages in 21.179 seconds (8.098 MB/sec).

As you can see the total time for the backup was equal to the time it took to

do a single backup across the network. BOL states that the backups are

synchronous. That makes sense because you want to only run through the database

pages one time so you'd have to successfully copy the page to all locations

(essentially a distributed transaction) before moving to the next page. I was

curious to see what would happen during a failure so I ran the test again, but

disconnected the network cable as soon as the backup had started. After a couple

seconds I received this error:

Msg 3202, Level 16, State 2, Line 1

Write on "\\train4\backup\advtest9.bak" failed: 64(The specified network name is

no longer available.)

Msg 3271, Level 16, State 1, Line 1

A nonrecoverable I/O error occurred on file "\\train4\backup\advtest9.bak:"

64(The specified network name is no longer available.).

Msg 3013, Level 16, State 1, Line 1

BACKUP DATABASE is terminating abnormally.

At this point I had no BAK file on the local drive, but I did have a 37k file

on the network share (full file would have been about 168k). It makes sense that

the file got left behind as SQL had no way of knowing that the network would

drop and at that point the command had terminated. Worth remembering to have a

check for old files on remote shares if you do decide to use mirroring. The

bigger problem here is that I wanted to add mirroring to accomplish what I used

to do with copy/FTP but it has one big difference; if the copy/FTP fails I still

have the original backup, but if the mirror backup fails I have no backup! The

synchronous nature of the mirroring makes for an all or none experience.

It is worth the risk of not having any backup the one time the job fails

compared to just writing a few lines of code in a job to copy the file after the

backup? There are a few variables to consider; how critical is one missing

backup, would you notice and take action if the job failed, are you mirroring to

local drives or across a slow/fast network. I think it's hard to recommend using

it, especially since it's not a standard part of the maintenance plans and it's

not hard to implement the copy after backup functionality.

Changes I'd like to see:

  • Add a switch that would let me set a configurable timeout so that if one

    mirror was taking an abnormally long time it could be abandoned and the rest

    of the job would continue

  • Continue the backup if any mirror fails (network connection, out of

    space)

  • Add it to the maintenance plans (along with the option to copy after

    instead of mirror)

Are you using backup mirroring, either the native SQL one or perhaps one in a

third party tool? I look forward to your comments.

Andy is a trainer with End to End Training (along with fellow SSC founders

Steve Jones & Brian Knight) focusing on beginning to intermediate level students

in performance tuning, replication, and administration. He is also a frequent

speaker at the PASS Summit and Code Camps in Florida. Visit the web site at

http://www.endtoendtraining.com.

Rate

3 (1)

Share

Share

Rate

3 (1)