Automated backup plan

  • I've been reading about and configuring an automated backup strategy, and would appreciate a comment or two on my plans. First of all, the databases are small and not time-critical or mission-critical. Losing them altogether would be a disaster, but losing a day, or even several days worth of work would be no more than a pain in the butt.

    It seems to me that using both differential backups and transaction log backups only makes sense for larger, sensitive databases. I have an enormous amount of storage space available, and a transaction log backup appears to use just about as much space as a differential backup. Using both seems to me would add an extra level of complexity to potential restore operations, for little benefit that I can see.

    My thought is to take a full backup every night, zip and spool the zipped copies off to various remote backup devices (both on and off site) and take either transaction log or differential backups every hour through the workday. The full backups, zipped, are around 10MB, and I have an almost empty 3TB drive for primary backup storage. Offsite capacities are considerably larger.

    Does this seem like a sensible plan? If so, would differential or transaction log backups during the day be better? I would guess differential, since you only need one good one for a restore operation, rather than having to chain a potential day's worth of transaction logs together, but I'd like to hear an opinion from someone who's had to perform such chores - I haven't had to (yet:-)).

    One thing does occur to me, that doing a routine like weekly full backup, daily differential, hourly transaction log would be good practice in a relatively small-scale environment, like I have now, rather than trying to deal with it all suddenly, if I do get something massive dropped on my head at some future date. Would it make sense to set up an unnecessarily complex backup routine just to get proficient at it? I'm not exactly short of projects, but this might be time well invested.

  • I'm not an expert in this area (I'm just learning) but an advice I've read says that you shouldn't have a backup plan. You should have a recovery plan:-D.

    You need to be sure that the backups work and you can use them in case of a disaster.

    This might add some work to your plan, but it'll give you something more trustful.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • You need to work out the Recovery Point Objective (RPO) and Recovery Time Objective (RTO) for each of your databases.

    Both of these will help you work out the appropriate recovery model and frequency/type of backups required.

    ... and as Luis mentioned above, you need to automate a restore process to ensure that all the lovely backups you have are actually worth the space they occupy!

    Good luck and let me know of you need further help as I went through a similar exercise recently.

    ---------------------------------------------------------

    It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
    David Edwards - Media lens[/url]

    Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
    Howard Zinn

  • Yes, I've read that sort of advice as well, and it's certainly true, but it's difficult to restore something without first having backed it up. I was looking for some thoughts on the planned frequency and type of backups, given what I described about the databases. If the correct answer flowed automatically from an analysis of the situation, I wouldn't have posted my question at all.

    I have no experience with restoring a hamburgered DB, just a few cases when I've destroyed something myself during testing, when all users were offline. In that case, I just deleted the database entirely, and did a complete restore from the full backup, to get back to pristine. Restoring something that users blow up, and maybe don't notice it for a while, is quite another matter. Experience with that provides insights that no amount of studying can replace.

  • I agree with everyone who's said you need to be concerned about recovery, but I also understand where you're coming from.

    I'd say a weekly full (I like Sunday nights, myself, just for simplicity's sake), a daily, and an hourly would be good.

    But I'd also run them in a test environment and make sure they can be restored completely without issue.

    _____________________________________________________________________
    -Jamie Scharbrough
    MCTS: SQL 2008R2

  • Jamie Scharbrough (10/17/2012)


    I agree with everyone who's said you need to be concerned about recovery, but I also understand where you're coming from.

    I'd say a weekly full (I like Sunday nights, myself, just for simplicity's sake), a daily, and an hourly would be good.

    But I'd also run them in a test environment and make sure they can be restored completely without issue.

    I agree as well - recovery is the ultimate goal, but once again, I can't even attempt a recovery if I haven't made a backup, and I can't make a regularly scheduled backup if I don't have some sort of plan. Testing restores is all well and good, but that comes AFTER having made a backup, to ensure that the backup was done correctly, and that it's useable.

    Simplicity was exactly my thought - with a small database, does mixing full, differential and transaction backups make sense, or is it unnecessarily complicated? Obviously, a full backup once per minute would be the simplest from a recovery standpoint, but that's hardly practical.

    You think, then, that such a cascaded full/differential/transaction approach makes sense, even given the situation I've described? If so, may I ask for your reasoning on the matter? It appears overly complex to me, but maybe I'm missing something, which is, after all, why I posted my original question.

  • pdanes (10/17/2012)


    You think, then, that such a cascaded full/differential/transaction approach makes sense, even given the situation I've described? If so, may I ask for your reasoning on the matter? It appears overly complex to me, but maybe I'm missing something, which is, after all, why I posted my original question.

    So let's take one of your DBs as an example. What's the most that your users can lose? And how much time can they wait for the DB to come back online?

    ---------------------------------------------------------

    It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
    David Edwards - Media lens[/url]

    Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
    Howard Zinn

  • pdanes (10/17/2012)

    I agree as well - recovery is the ultimate goal, but once again, I can't even attempt a recovery if I haven't made a backup, and I can't make a regularly scheduled backup if I don't have some sort of plan. Testing restores is all well and good, but that comes AFTER having made a backup, to ensure that the backup was done correctly, and that it's useable.

    Simplicity was exactly my thought - with a small database, does mixing full, differential and transaction backups make sense, or is it unnecessarily complicated? Obviously, a full backup once per minute would be the simplest from a recovery standpoint, but that's hardly practical.

    You think, then, that such a cascaded full/differential/transaction approach makes sense, even given the situation I've described? If so, may I ask for your reasoning on the matter? It appears overly complex to me, but maybe I'm missing something, which is, after all, why I posted my original question.

    Well, it also depends on how small we're talking and how comfortable you are with backups in general.

    Personally, I'm currently running a small test server, and I do just as I've described, because I like knowing I can back up from any point.

    This is honestly one where it really depends on you and your comfort zone. If you feel that a weekly full and a daily differential are sufficient, then go for it. I'd also try "breaking" it at some point too; if your backup is corrupted, if your backup just doesn't restore, whatever, so you can learn how to fix it too.

    I guess my feeling is you said before that you wanted to practice. Doing the full monty here will help you with that, and for larger databases I know best practices recommend fulls/diffs/trans. So you might as well go whole hog to start with.

    But this is just my opinion; feel free to take it with a grain a salt! 😀

    _____________________________________________________________________
    -Jamie Scharbrough
    MCTS: SQL 2008R2

  • Abu Dina (10/17/2012)


    pdanes (10/17/2012)


    You think, then, that such a cascaded full/differential/transaction approach makes sense, even given the situation I've described? If so, may I ask for your reasoning on the matter? It appears overly complex to me, but maybe I'm missing something, which is, after all, why I posted my original question.

    So let's take one of your DBs as an example. What's the most that your users can lose? And how much time can they wait for the DB to come back online?

    There are currently six, the parameters for all are similar.

    As I wrote in the original post, losing even several days worth of work is a bloody nuisance, but by no means a tragedy. That's certainly not my ideal, which is why I was thinking a full backup nightly, and either differential or transaction backups once per hour, and if I can somehow get a reliable "last changed" timestamp from the database, I will run the hourly only if someone has been using the database - there's no point in making repeated snapshots of something that isn't changing at all, which is often the case.

    How long isn't a pressing issue either. Earlier this year, one of the drives in a server's RAID group went down just as several of us were leaving the country for a few days, and no replacement was immediately available. Rather than risk a real disaster, I simply shut the server down, told everyone that I would get on it as soon as I got back and we left. No sweat, people just worked on other things. I scored a drive from IT on my way back, plugged it in, waited a while for the RAID to rebuild and we were back in business. Again, several days is not my goal, but it's tolerable, as long as it doesn't happen often.

    In practical terms, losing up to an hour's worth of work is fine, and if I can get it fixed by the next day, that's acceptable as well.

    Does that help?

  • Luis Cazares (10/17/2012)


    I'm not an expert in this area (I'm just learning) but an advice I've read says that you shouldn't have a backup plan. You should have a recovery plan

    So very very true.

    pdanes (10/17/2012)


    Yes, I've read that sort of advice as well, and it's certainly true, but it's difficult to restore something without first having backed it up.

    Which kind of misses the point about the above advice (having a recovery plan). Recovery plans aren't about restoring the databases. Recovery plans are about "How much data can you afford to lose?"

    Before you design your backups, you need to know the following:

    1) How long can the database be down (and non-functional) while you are restoring?

    2) How many minutes, hours, or days behind can your data be once you have restored? (I.E., how much data can you afford to lose?)

    3) What kind of system (OLTP, OLAP, etc.) are you planning backups for?

    4) How much your data changes over the course of a day, a week, and a month?

    5) Is the database-read only or read-write?

    6) How big is the database?

    These are all very important questions to know the answer to before you decide what kind of backups you do, how often you do them, and how many backups you keep. If you don't know the answers to these questions, you're just guessing and guessing can get people into trouble.

    Recovery strategies are very important. If you need to be able to do Point-in-Time restores, for instance, the database recovery mode is going to be set to FULL (or should be). If your database is in the Terabyte / Petabyte size range, you might be doing file / filegroup backups instead of full backups. Or you might be doing partial backups. If your database is read only, you don't need to back it up every day. Whereas if you're running a highly active transactional database where data gets updated on a minute to minute basis (think airplane companies or hotels), you might need differentials and complete / full backups along with transaction log backups.

    Remember, a good strategy isn't about restoring a database. It's about optimizing your time and data protection. You don't need a backup to start with a recovery strategy. But if your databases aren't getting backed up at all, set up complete / full backups and transaction log backups with FULL recovery mode on until you do get your recovery strategy set. That's the safest bet to cover your bases.

    Then sit down with your business users and establish a reasonable recovery plan SLA before Bad Things<tm> start to happen.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • Jamie Scharbrough (10/17/2012)


    Well, it also depends on how small we're talking and how comfortable you are with backups in general.

    Biggest DB is a bit over 100MB, and my experience with things like this is minimal. I've been making manual full backups and stashing them offline, but I want to build something automatic and more up-to-date.

    Personally, I'm currently running a small test server, and I do just as I've described, because I like knowing I can back up from any point.

    That's what I would like to be able to do. But I'm still wondering, does this three-stage approach make that any more doable? If I take a differential backup every hour, I can (theoretically, I know) restore in two steps. The last full and the last differential. If I use transaction logs, it can be much more complex, no? As an example, say I have a DB that has a nightly full backup and an hourly differential during the course of a workday, starting at 9am. If the DB crashes at 4:30, I can restore from the nightly and the 4pm diff (or earlier, if it turns out the error occurred earlier and didn't show up until 4:30). If I use transaction log backups instead of differentials, I have to use the nightly full, then the 9am trans, then the 10am trans, then the 11am trans.... And if it turns out that the 4pm trans included a screwup, I have to start the whole process again, only stopping at 3pm, then if it's still there, again, stopping at 2pm... If I go with the full three-stage process, I have to use the last weekly full, then last night's diff, then all trans logs up to the problem point. Isn't that much more complex, with many more possible points of failure? Or maybe that's not how these things work and I don't yet understand all there is about the various backup methods.

    This is honestly one where it really depends on you and your comfort zone. If you feel that a weekly full and a daily differential are sufficient, then go for it. I'd also try "breaking" it at some point too; if your backup is corrupted, if your backup just doesn't restore, whatever, so you can learn how to fix it too.

    No, I'd like an hourly snapshot, I'm just not clear on what is the best way of achieving it.

    I guess my feeling is you said before that you wanted to practice. Doing the full monty here will help you with that, and for larger databases I know best practices recommend fulls/diffs/trans. So you might as well go whole hog to start with.

    But this is just my opinion; feel free to take it with a grain a salt! 😀

    I always do that - even if I don't agree with an opinion or don't use some advice, it often helps immensely to hear other people's thoughts on the matter, and having to explain my position clearly enough to get it across to someone else is a great aid in organizing my own thoughts.

    But I can certainly use the practice, so I may set it up full-bore just to get familiar with the process. I'd just like to know clearly what I'm doing and why I'm doing it, rather than follow recipes blindly.

  • That's what I would like to be able to do. But I'm still wondering, does this three-stage approach make that any more doable? If I take a differential backup every hour, I can (theoretically, I know) restore in two steps. The last full and the last differential. If I use transaction logs, it can be much more complex, no? As an example, say I have a DB that has a nightly full backup and an hourly differential during the course of a workday, starting at 9am. If the DB crashes at 4:30, I can restore from the nightly and the 4pm diff (or earlier, if it turns out the error occurred earlier and didn't show up until 4:30). If I use transaction log backups instead of differentials, I have to use the nightly full, then the 9am trans, then the 10am trans, then the 11am trans.... And if it turns out that the 4pm trans included a screwup, I have to start the whole process again, only stopping at 3pm, then if it's still there, again, stopping at 2pm... If I go with the full three-stage process, I have to use the last weekly full, then last night's diff, then all trans logs up to the problem point. Isn't that much more complex, with many more possible points of failure? Or maybe that's not how these things work and I don't yet understand all there is about the various backup methods.

    The answer to this is: It depends.

    I guess I'm approaching this with the view that this is a learning enviornment, and so you should do all the steps for practice.

    Let's say you do the three different backups and find that your 4 pm trans doesn't work. If you find that your 3 and 2 pms also don't work...well, personally, if I was in a test or learning enviornment...I'd be glad I chose to have a more complex set up, because essentially, it caught the problem earlier. At this point, you'd stop and try and figure out what the screw up is.

    Yes, you'll lose data, but would you have lost MORE data if you'd only done the two backups?

    And yes, there are more points of failure, but if you're doing this to practice automatic backups, you might as well catch the snags earlier, since you'll probably have to do all three backups with a larger database.

    No, I'd like an hourly snapshot, I'm just not clear on what is the best way of achieving it.

    I'm not sure that's possible, as snapshots only include full backups. There may be a way to do this via a 3rd party tool; I'd have to look into this. (I'm pretty new to snapshots). You can use them for differentials, but you'll need the VSS interface.

    I always do that - even if I don't agree with an opinion or don't use some advice, it often helps immensely to hear other people's thoughts on the matter, and having to explain my position clearly enough to get it across to someone else is a great aid in organizing my own thoughts.

    But I can certainly use the practice, so I may set it up full-bore just to get familiar with the process. I'd just like to know clearly what I'm doing and why I'm doing it, rather than follow recipes blindly.

    _____________________________________________________________________
    -Jamie Scharbrough
    MCTS: SQL 2008R2

  • After reading the posts on this thread I'm sensing you already kinda know the acceptable thresholds of data loss based on experience in the company. That being said confirm them with your business owners. While disaster recovery is and can be very complex and detailed, I'm trying to simplify it in this post.

    You're dealing with very small databases, the IO cost to back them up is not daunting to your servers performance I'm sure (but then again I don't know the specs and you should check this before hand, I'm basing this off the low MB numbers you've mentioned and semi-modern hardware).

    I propose based on what I've read:

    Full Backup - Weekly (day with least SQL activity)

    Differential Backup - Every 12 hours (excluding the time frame you take your full backup)

    You haven't mentioned retention much so you'll want to determine that, don't forget about it or one day your drive will fill with the backups 😉

    From a recovery standpoint though it will be pretty simple, restore the latest weekly full backup and then restore the latest differential.

    If you do the differentials at 12 AM and 12 PM (again, assuming that's acceptable regarding workload/performance) you could have a failure at 9 AM, and you restore back to midnight, or if its in the afternoon you restore back to 12 noon. While this doesn't give you point in time recovery based on your details it keeps the recovery well within acceptance. (DBs using this strategy should be in SIMPLE recovery model as Point-In-Time recovery is not needed, you will let the engine manage your .LDFs)

    If your DBs start growing quickly, the IO costs are going to increase and you will need to re-adjust, keep that in mind. DR is never a set it and forget it deal. Also, it's imperative to test the backups once you've implemented your backup strategy. If you're wondering why, here is a great read:

    http://www.sqlservercentral.com/articles/Backups/93224/

    Best Regard!

  • Jamie Scharbrough (10/17/2012)


    The answer to this is: It depends.

    I guess I'm approaching this with the view that this is a learning enviornment, and so you should do all the steps for practice.

    That's certainly legit, but I'm trying to comprehend some justifications for various strategies at the same time.

    Let's say you do the three different backups and find that your 4 pm trans doesn't work. If you find that your 3 and 2 pms also don't work...well, personally, if I was in a test or learning enviornment...I'd be glad I chose to have a more complex set up, because essentially, it caught the problem earlier. At this point, you'd stop and try and figure out what the screw up is.

    Yes, you'll lose data, but would you have lost MORE data if you'd only done the two backups?

    I think I'm not explaining myself correctly. Here's a more detailed try:

    I can do a nightly full and an hourly trans (9am, 10am, 11am...)

    - or -

    a nightly full and an hourly differential (9am, 10am, 11am...)

    In the first case, if I want to recover to 3pm status, I'll have to use the nightly full and the 9am trans, the 10am trans, the 11am trans, the 12noon trans, the 1pm trans, the 2pm trans and the 3pm trans, to recover all the things that happened that day.

    In the second case, to get to 3pm, I need only the nightly full and the 3pm diff, no?

    If I comprehended the concept of the trans and differential correctly, the differential carries everything since the last full, while the trans carries only the stuff since the last trans, potentially requiring numerous steps to recover to a certain point, while a diff can do it in one step, at the cost of more space used in the backup file.

    Also, if one of the trans files is corrupt, everything after that is unusable, since trans log restores build on one another. If one of the diff files is corrupt, the one just after may well be okay. In a trans recovery chain, -every- file in the chain has to be okay. Restore from a diff file is not dependent on any other diff files.

    No, I'd like an hourly snapshot, I'm just not clear on what is the best way of achieving it.

    I'm not sure that's possible, as snapshots only include full backups. There may be a way to do this via a 3rd party tool; I'd have to look into this. (I'm pretty new to snapshots). You can use them for differentials, but you'll need the VSS interface.

    I think my terminology was sloppy here. I meant a backup-restore point, of the sort achieved by one of the normal SQL backup operations, not something exotic.

  • pdanes (10/17/2012)


    Jamie Scharbrough (10/17/2012)


    The answer to this is: It depends.

    I guess I'm approaching this with the view that this is a learning enviornment, and so you should do all the steps for practice.

    That's certainly legit, but I'm trying to comprehend some justifications for various strategies at the same time.

    I think I'm not explaining myself correctly. Here's a more detailed try:

    I can do a nightly full and an hourly trans (9am, 10am, 11am...)

    - or -

    a nightly full and an hourly differential (9am, 10am, 11am...)

    In the first case, if I want to recover to 3pm status, I'll have to use the nightly full and the 9am trans, the 10am trans, the 11am trans, the 12noon trans, the 1pm trans, the 2pm trans and the 3pm trans, to recover all the things that happened that day.

    In the second case, to get to 3pm, I need only the nightly full and the 3pm diff, no?

    If I comprehended the concept of the trans and differential correctly, the differential carries everything since the last full, while the trans carries only the stuff since the last trans, potentially requiring numerous steps to recover to a certain point, while a diff can do it in one step, at the cost of more space used in the backup file.

    Also, if one of the trans files is corrupt, everything after that is unusable, since trans log restores build on one another. If one of the diff files is corrupt, the one just after may well be okay. In a trans recovery chain, -every- file in the chain has to be okay. Restore from a diff file is not dependent on any other diff files.

    Okay I get what you're saying. And yes, you're right; so your best bet WOULD be to do a nightly full and an hourly diff. Sorry, I was hung up on the learning part.

    I'm not sure that's possible, as snapshots only include full backups. There may be a way to do this via a 3rd party tool; I'd have to look into this. (I'm pretty new to snapshots). You can use them for differentials, but you'll need the VSS interface.

    I think my terminology was sloppy here. I meant a backup-restore point, of the sort achieved by one of the normal SQL backup operations, not something exotic.

    Ah, okay. If you want a specific point in time, a simple T-SQL statement could do it, or you could create a job.

    _____________________________________________________________________
    -Jamie Scharbrough
    MCTS: SQL 2008R2

Viewing 15 posts - 1 through 15 (of 21 total)

You must be logged in to reply to this topic. Login to reply