January 27, 2009 at 9:38 am
So...here is my question. (sorry if this gets a little long, trying to include everything at once)
Lets say I have a backup job or a log backup job with the following syntax
BACKUP DATABASE [DB_Lee] TO [DB_LEEBackup] WITH INIT, NOFORMAT, SKIP, NAME = N'DB_Lee - Full DB Backup', STATS = 10
or
BACKUP LOG [DB_Lee] TO [DB_LeeBackup] WITH NOINIT, NOFORMAT, SKIP, NAME = N'DB_Lee - Log Backup', STATS = 10
I have the WITH INT, NOFORMAT, SKIP options which are the standards set here at my job for DB Backups and for Log backups.
I have a couple of scenario's that happen frequently and the workarounds that have been in place here when backups fail or when log backups fail that I would like to ask the experts around here about since I can't get a decent answer from my colleagues.
First off, we get a random event where a backup fails or a log backup fails with the following type error:
The backup data at the end of "DB_Lee(\\FileServer_A\sqlbackup\ServerName\DB_lee.bak)" is incorrectly formatted.
Backup sets on the media might be damaged and unusable.
So running the above database backup will continue to fail since the backup job has "NOFORMAT". The work around I found on a KB article was to change the job to FORMAT which would erase the backup device. http://support.microsoft.com/kb/290787
The way things are generally done here at my work, is the backup file is just manually deleted and a backup is re-run. No biggy...I guess.
My issue is that sometimes people miss these failures or just ignore them...and the failures continue "human nature I guess". So the log doesn't get backed up and continues to grow and in some cases has filled the drive which is obviously bad...
My question is this:
What is the preferred setting and why? We either do our backups to a local drive and then copy to a file server or backup to file servers directly which then get backed up to tape on a different schedule.
It seems to me that the NOFORMAT clause can cause issues as it doesn't allow the db backup or the log backup to continue once the backup device becomes invalid vs having the FORMAT clause which apparently would fix the issues at least for the DB Backup job (which gets run daily here).
The only real thing that appears to be a positive (at least that I have seen) for the NOFORMAT option is that when a DB backup job fails "network glitch", "insert reason here" that the backup device doesn't become invalid and the log backups can continue to backup to that device.
I guess I haven't tested out specific disaster recovery scenario's, but I would really like some input on what everyone else does to get a better picture.
Thanks in advance,
Lee
January 27, 2009 at 9:48 am
Lee Hart (1/27/2009)
The way things are generally done here at my work, is the backup file is just manually deleted and a backup is re-run. No biggy...I guess.
No biggie?
By deleting the backup file, you've deleted all the log backups, and your full backup. What will you do if the DB fails at that point? What's the latest that's available on the file server if that happens?
What is the preferred setting and why? We either do our backups to a local drive and then copy to a file server or backup to file servers directly which then get backed up to tape on a different schedule.
Each log backup should go to a separate file, not get appended to a single file. Each database backup should go to a separate file. Both on the local disk. Copy them elsewhere afterwards and archive wherever.
There's no good reason (other than convenience) to append backups to a single file. Steve did an editorial about this a while back. Let me see if I can find it...
http://www.sqlservercentral.com/articles/Editorial/65492/
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 27, 2009 at 10:11 am
Lee - Steve just wrote an editorial on this with some associated discussion that can be found here http://www.sqlservercentral.com/articles/Editorial/65492/ It would certainly be worthwhile to read that and some of the comments associated with it.
Certainly some of the scenarios that you are discussing are common with backup devices hence the reason that many avoid them. The most important consideration in our backup configuration is ensuring that in all circumstances we can recover our database in accordance with the businesses requirements / expectations. I wonder if that is true when you have problems with your devices as you describe and they become unusable. It would certainly be something that I would want to test and validate, if for no other reason that for your own sleep at night.
I remember when I first started as a DBA someone stressed the importance of certain items that a DBA takes care of. The top 3 were backups. Hopefully that is enough encouragement for you to do some digging and testing... 🙂
David
@SQLTentmaker“He is no fool who gives what he cannot keep to gain that which he cannot lose” - Jim Elliot
January 27, 2009 at 10:14 am
GilaMonster (1/27/2009)
Lee Hart (1/27/2009)
The way things are generally done here at my work, is the backup file is just manually deleted and a backup is re-run. No biggy...I guess.No biggie?
By deleting the backup file, you've deleted all the log backups, and your full backup. What will you do if the DB fails at that point? What's the latest that's available on the file server if that happens?
What is the preferred setting and why? We either do our backups to a local drive and then copy to a file server or backup to file servers directly which then get backed up to tape on a different schedule.
Each log backup should go to a separate file, not get appended to a single file. Each database backup should go to a separate file. Both on the local disk. Copy them elsewhere afterwards and archive wherever.
There's no good reason (other than convenience) to append backups to a single file. Steve did an editorial about this a while back. Let me see if I can find it...
I didn't mean to make light of the importance of backups and I am fully aware of what can happen when you delete the backup file and have a failure...no matter how small the chance there is still that chance, so bad choice of words on my part completely.
I generally rename the backup file in question to something like xxx.bak.old until the new backup completes when I come across failures like this.
To expand on your comments about putting all backups into a single device / file vs. having a single file for each backup...
how do you facilitate something like that? I don't want to sound ignorant, I learned on the job so to speak and took the methods that were in place before I got here as tried and true but like you have pointed out I am starting to question some of the reasons and methods which is why I am asking.
As an example, lets say you have a server and that server has 50 databases, each in FULL recovery mode...
50 x 23 log backups would be approx. 1150 individual files not including the 50 db backup files...
Is that correct? when you say one file for each backup?
So instead of having DB_LeeBackup.bak (1 db backup and 23 log backups)
You would have:
DB_LeeBackup.bak (DB Backup file)
DB_Lee_LogBackup_1.bak
DB_Lee_LogBackup_2.bak
DB_Lee_LogBackup_3.bak
DB_Lee_LogBackup_4.bak
...
DB_Lee_LogBackup_23.bak
Thanks,
Lee
January 27, 2009 at 10:28 am
Yes, you would have that many files. There are many scripts that will facilitate the creation of the individual files with date / time stamps as appropriate and there are even some scripts which can support a more automated recovery approach. So, if the file count seems a bit overwhelming, don't let it.
Search the site for some great examples, and post back with questions if you have problems.
David
@SQLTentmaker“He is no fool who gives what he cannot keep to gain that which he cannot lose” - Jim Elliot
January 27, 2009 at 10:31 am
DavidB (1/27/2009)
Lee - Steve just wrote an editorial on this with some associated discussion that can be found here http://www.sqlservercentral.com/articles/Editorial/65492/ It would certainly be worthwhile to read that and some of the comments associated with it.Certainly some of the scenarios that you are discussing are common with backup devices hence the reason that many avoid them. The most important consideration in our backup configuration is ensuring that in all circumstances we can recover our database in accordance with the businesses requirements / expectations. I wonder if that is true when you have problems with your devices as you describe and they become unusable. It would certainly be something that I would want to test and validate, if for no other reason that for your own sleep at night.
I remember when I first started as a DBA someone stressed the importance of certain items that a DBA takes care of. The top 3 were backups. Hopefully that is enough encouragement for you to do some digging and testing... 🙂
I am reading the post now and it is very interesting to see how others practice the art of backups.
It seems everyone has their own methods, and some have extra constraints they have to deal with. I have a couple of hurdles trying to over come, one being "the norm" stigma. Also space concerns as it is always hard to manage space. Just production, we have around 120 SQL servers, the biggest DB server "in size" is around 200 gb's. but the rest are generally between 8 and 60 GB's
To go a little more in-depth, in our environment each server backups up to a File Server that has its own folder, and each database has its own file (Currently).
Like I said in the above post, I am questioning that methodology as it is listed in some of the forum replies to that article and am looking for advice.
Lee
January 27, 2009 at 10:34 am
Something like that.
One folder per database, and both database and log backups get time stamps. So, for 2 databases, I'd have something like this:
DB1\DB1_Full_20090101.bak
DB1\DB1_200901010800.trn
DB1\DB1_200901010815.trn
DB1\DB1_200901010830.trn
DB1\DB1_200901010845.trn
DB1\DB1_200901010900.trn
DB1\DB1_200901010915.trn
DB1\DB1_200901010930.trn
DB2\DB2_Full_20080101.bak
DB2\DB2_200901010800.trn
DB2\DB2_200901010815.trn
DB2\DB2_200901010830.trn
DB2\DB2_200901010845.trn
DB2\DB2_200901010900.trn
DB2\DB2_200901010915.trn
DB2\DB2_200901010930.trn
Now the maintenance plan can do that for you, or you can get one of many of the backup scripts from the script library here that will do that for you.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 27, 2009 at 11:07 am
GilaMonster (1/27/2009)
Something like that.One folder per database, and both database and log backups get time stamps. So, for 2 databases, I'd have something like this:
DB1\DB1_Full_20090101.bak
DB1\DB1_200901010800.trn
DB1\DB1_200901010815.trn
DB1\DB1_200901010830.trn
DB1\DB1_200901010845.trn
DB1\DB1_200901010900.trn
DB1\DB1_200901010915.trn
DB1\DB1_200901010930.trn
DB2\DB2_Full_20080101.bak
DB2\DB2_200901010800.trn
DB2\DB2_200901010815.trn
DB2\DB2_200901010830.trn
DB2\DB2_200901010845.trn
DB2\DB2_200901010900.trn
DB2\DB2_200901010915.trn
DB2\DB2_200901010930.trn
Now the maintenance plan can do that for you, or you can get one of many of the backup scripts from the script library here that will do that for you.
Hi Gail,
After thinking about it further I went and spoke with my manager to get his view point as to the Why...
He gave the reason of the single file for a database backup (db & log) because that recovery model fits with the archiving methods in place. The current backup to tape software works on a schedule and the schedule is independent to the sql server backups.
The short version also included the conversation about removing older files that aren't needed. I am sure there are scripts that are also in the script library which facilitate the removal of older / not needed files as well...
however scheduling the removal of old files vs. space concerns vs. conflicting schedules of the tape backup software (as it backs up everything not just sql backups) appears to be more of a concern vs. having a single point of failure.
Moving forward in the near future with SQL 2008 and the implementation of SAP and a contracted Data Center I am told that we are going to a new model, one like you describe where each backup is its own file and the tape backup software is streaming (within a set timeframe "30 minutes or 1 hour") where it looks periodically for new files and streams them off the backup server vs. our current tape backup software which runs on a schedule that can be influenced by many different variables.
hope this makes some sense...and Thank you for the comments they have really helped open some new windows for me to review.
Lee
Lee
January 27, 2009 at 11:41 am
Lee - Hopefully my interjections aren't causing confusion in conjunction with Gail's comments. If so, go with hers. 🙂
The tape backup situation should not be an issue. I'm assuming that the tape will backup your devices that you are using now once in a 24 hour period. The same would be true with the folders containing the files. There is always the potential that you have a file that is "open" or being written to at the time of the tape backup but that will be overcome with the following day's backups as it will get that file when it is not being written to. So, as long as you have 2 days on disk then you can keep yourself really well covered.
I actually take it a step further and use a script to delete files older than N days IF the file has the archive bit set which is only set by the tape backup process. This allows me to ensure that I first have the backup on tape before it ever gets removed from disk. No gaps, and no gaps is good. Poor English, good recovery.
Either way, it sounds like you are in a phase where you are going to be designing a new solution. I would certainly make testing your current solution part of that planning and to ensure that you are covered. Have someone draw a pick a day from that last month and see if you can restore to that point (on a different server of course). That is always a telling test. Great for opening the eyes of management as to the effectiveness of our backup / recovery process.
Hope this is helpful.
David
@SQLTentmaker“He is no fool who gives what he cannot keep to gain that which he cannot lose” - Jim Elliot
January 27, 2009 at 1:47 pm
And just to add to your reading pleasure, Paul Randal wrote this article. It may give you a little better understanding of recovery. Not necessarily what is right for your situtation but maybe a little more knowledge on the subject.
http://technet.microsoft.com/en-us/magazine/2009.02.logging.aspx
-- You can't be late until you show up.
January 27, 2009 at 1:54 pm
DavidB (1/27/2009)
Lee - Hopefully my interjections aren't causing confusion in conjunction with Gail's comments. If so, go with hers. 🙂The tape backup situation should not be an issue. I'm assuming that the tape will backup your devices that you are using now once in a 24 hour period. The same would be true with the folders containing the files. There is always the potential that you have a file that is "open" or being written to at the time of the tape backup but that will be overcome with the following day's backups as it will get that file when it is not being written to. So, as long as you have 2 days on disk then you can keep yourself really well covered.
I actually take it a step further and use a script to delete files older than N days IF the file has the archive bit set which is only set by the tape backup process. This allows me to ensure that I first have the backup on tape before it ever gets removed from disk. No gaps, and no gaps is good. Poor English, good recovery.
Either way, it sounds like you are in a phase where you are going to be designing a new solution. I would certainly make testing your current solution part of that planning and to ensure that you are covered. Have someone draw a pick a day from that last month and see if you can restore to that point (on a different server of course). That is always a telling test. Great for opening the eyes of management as to the effectiveness of our backup / recovery process.
Hope this is helpful.
Hi David,
This discussion has been very helpful. I see your logic regarding the 2 days of backups on a server and it makes sense...
In our current environment the constraints of space probably wouldn't allow for a similar setup as yours or what you were suggesting.
Just a question about your methods...you mention you...
I actually take it a step further and use a script to delete files older than N days IF the file has the archive bit set which is only set by the tape backup process. This allows me to ensure that I first have the backup on tape before it ever gets removed from disk.
Does this archive bit get set on the backup file somehow? By your tape backup software? Just curious I guess.
Thanks,
Lee
January 27, 2009 at 2:51 pm
Yes, the archive bit does get set on the backup file by the tape backup software. Works well. I believe this to be true of most if not all tape vendors however, I can't speak emphatically on that as I have really only worked with a couple.
As for space, are you using compression software now such as Red-Gate sqlbackup or Quest SQLLiteSpeed? You might want to look at these. The Red-Gate offering is very pocketbook friendly and can win some major points with management and your storage guys as you can see in excess of 75% reduction in your space used for backups.
David
@SQLTentmaker“He is no fool who gives what he cannot keep to gain that which he cannot lose” - Jim Elliot
January 27, 2009 at 2:57 pm
tosscrosby (1/27/2009)
And just to add to your reading pleasure, Paul Randal wrote this article. It may give you a little better understanding of recovery. Not necessarily what is right for your situtation but maybe a little more knowledge on the subject.http://technet.microsoft.com/en-us/magazine/2009.02.logging.aspx
Hi Terry,
Thanks for that article, I just got done reading it and I found some of the very technical stuff very helpful...I always like to know the inner workings and I didn't know a couple of the things listed in that article.
In regards of backups and disaster recovery I had an interesting occurrence happen, when testing out a restore script that did a full recovery (DB & Logs).
"I think" based on scheduling of the db backup job and the log backup job that the backup LSN's got out of sequence which invalidated the whole backup set.
For instance...
DB backup was taken on a local drive and start at 01:00
Log backup was taken to a network drive and started at 01:00
When the DB backup was finished it was copied to the network drive.
When I attempted to restore the backup set it failed after the full backup finished and when I tried to restore the logs saying the LSN's were out of sequence basically.
I kind of understand how it happened but not fully and was curious if anyone could describe how such an event could take place.
Thanks again that was a good article!
Lee
Viewing 13 posts - 1 through 13 (of 13 total)
You must be logged in to reply to this topic. Login to reply