Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase ««123»»

Automated backup plan Expand / Collapse
Author
Message
Posted Wednesday, October 17, 2012 12:35 PM


Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Sunday, August 24, 2014 3:38 PM
Points: 336, Visits: 942
Jamie Scharbrough (10/17/2012)

Well, it also depends on how small we're talking and how comfortable you are with backups in general.

Biggest DB is a bit over 100MB, and my experience with things like this is minimal. I've been making manual full backups and stashing them offline, but I want to build something automatic and more up-to-date.

Personally, I'm currently running a small test server, and I do just as I've described, because I like knowing I can back up from any point.

That's what I would like to be able to do. But I'm still wondering, does this three-stage approach make that any more doable? If I take a differential backup every hour, I can (theoretically, I know) restore in two steps. The last full and the last differential. If I use transaction logs, it can be much more complex, no? As an example, say I have a DB that has a nightly full backup and an hourly differential during the course of a workday, starting at 9am. If the DB crashes at 4:30, I can restore from the nightly and the 4pm diff (or earlier, if it turns out the error occurred earlier and didn't show up until 4:30). If I use transaction log backups instead of differentials, I have to use the nightly full, then the 9am trans, then the 10am trans, then the 11am trans.... And if it turns out that the 4pm trans included a screwup, I have to start the whole process again, only stopping at 3pm, then if it's still there, again, stopping at 2pm... If I go with the full three-stage process, I have to use the last weekly full, then last night's diff, then all trans logs up to the problem point. Isn't that much more complex, with many more possible points of failure? Or maybe that's not how these things work and I don't yet understand all there is about the various backup methods.

This is honestly one where it really depends on you and your comfort zone. If you feel that a weekly full and a daily differential are sufficient, then go for it. I'd also try "breaking" it at some point too; if your backup is corrupted, if your backup just doesn't restore, whatever, so you can learn how to fix it too.

No, I'd like an hourly snapshot, I'm just not clear on what is the best way of achieving it.

I guess my feeling is you said before that you wanted to practice. Doing the full monty here will help you with that, and for larger databases I know best practices recommend fulls/diffs/trans. So you might as well go whole hog to start with.

But this is just my opinion; feel free to take it with a grain a salt! :D

I always do that - even if I don't agree with an opinion or don't use some advice, it often helps immensely to hear other people's thoughts on the matter, and having to explain my position clearly enough to get it across to someone else is a great aid in organizing my own thoughts.

But I can certainly use the practice, so I may set it up full-bore just to get familiar with the process. I'd just like to know clearly what I'm doing and why I'm doing it, rather than follow recipes blindly.
Post #1373999
Posted Wednesday, October 17, 2012 12:52 PM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Tuesday, January 15, 2013 8:43 AM
Points: 37, Visits: 82
That's what I would like to be able to do. But I'm still wondering, does this three-stage approach make that any more doable? If I take a differential backup every hour, I can (theoretically, I know) restore in two steps. The last full and the last differential. If I use transaction logs, it can be much more complex, no? As an example, say I have a DB that has a nightly full backup and an hourly differential during the course of a workday, starting at 9am. If the DB crashes at 4:30, I can restore from the nightly and the 4pm diff (or earlier, if it turns out the error occurred earlier and didn't show up until 4:30). If I use transaction log backups instead of differentials, I have to use the nightly full, then the 9am trans, then the 10am trans, then the 11am trans.... And if it turns out that the 4pm trans included a screwup, I have to start the whole process again, only stopping at 3pm, then if it's still there, again, stopping at 2pm... If I go with the full three-stage process, I have to use the last weekly full, then last night's diff, then all trans logs up to the problem point. Isn't that much more complex, with many more possible points of failure? Or maybe that's not how these things work and I don't yet understand all there is about the various backup methods.


The answer to this is: It depends.

I guess I'm approaching this with the view that this is a learning enviornment, and so you should do all the steps for practice.

Let's say you do the three different backups and find that your 4 pm trans doesn't work. If you find that your 3 and 2 pms also don't work...well, personally, if I was in a test or learning enviornment...I'd be glad I chose to have a more complex set up, because essentially, it caught the problem earlier. At this point, you'd stop and try and figure out what the screw up is.

Yes, you'll lose data, but would you have lost MORE data if you'd only done the two backups?

And yes, there are more points of failure, but if you're doing this to practice automatic backups, you might as well catch the snags earlier, since you'll probably have to do all three backups with a larger database.


No, I'd like an hourly snapshot, I'm just not clear on what is the best way of achieving it.


I'm not sure that's possible, as snapshots only include full backups. There may be a way to do this via a 3rd party tool; I'd have to look into this. (I'm pretty new to snapshots). You can use them for differentials, but you'll need the VSS interface.


I always do that - even if I don't agree with an opinion or don't use some advice, it often helps immensely to hear other people's thoughts on the matter, and having to explain my position clearly enough to get it across to someone else is a great aid in organizing my own thoughts.

But I can certainly use the practice, so I may set it up full-bore just to get familiar with the process. I'd just like to know clearly what I'm doing and why I'm doing it, rather than follow recipes blindly.


_____________________________________________________________________
-Jamie Scharbrough
MCTS: SQL 2008R2
Post #1374008
Posted Wednesday, October 17, 2012 1:21 PM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: 2 days ago @ 4:47 PM
Points: 192, Visits: 400
After reading the posts on this thread I'm sensing you already kinda know the acceptable thresholds of data loss based on experience in the company. That being said confirm them with your business owners. While disaster recovery is and can be very complex and detailed, I'm trying to simplify it in this post.

You're dealing with very small databases, the IO cost to back them up is not daunting to your servers performance I'm sure (but then again I don't know the specs and you should check this before hand, I'm basing this off the low MB numbers you've mentioned and semi-modern hardware).

I propose based on what I've read:

Full Backup - Weekly (day with least SQL activity)
Differential Backup - Every 12 hours (excluding the time frame you take your full backup)

You haven't mentioned retention much so you'll want to determine that, don't forget about it or one day your drive will fill with the backups

From a recovery standpoint though it will be pretty simple, restore the latest weekly full backup and then restore the latest differential.

If you do the differentials at 12 AM and 12 PM (again, assuming that's acceptable regarding workload/performance) you could have a failure at 9 AM, and you restore back to midnight, or if its in the afternoon you restore back to 12 noon. While this doesn't give you point in time recovery based on your details it keeps the recovery well within acceptance. (DBs using this strategy should be in SIMPLE recovery model as Point-In-Time recovery is not needed, you will let the engine manage your .LDFs)

If your DBs start growing quickly, the IO costs are going to increase and you will need to re-adjust, keep that in mind. DR is never a set it and forget it deal. Also, it's imperative to test the backups once you've implemented your backup strategy. If you're wondering why, here is a great read:

http://www.sqlservercentral.com/articles/Backups/93224/

Best Regard!
Post #1374015
Posted Wednesday, October 17, 2012 1:23 PM


Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Sunday, August 24, 2014 3:38 PM
Points: 336, Visits: 942
Jamie Scharbrough (10/17/2012)

The answer to this is: It depends.

I guess I'm approaching this with the view that this is a learning enviornment, and so you should do all the steps for practice.

That's certainly legit, but I'm trying to comprehend some justifications for various strategies at the same time.

Let's say you do the three different backups and find that your 4 pm trans doesn't work. If you find that your 3 and 2 pms also don't work...well, personally, if I was in a test or learning enviornment...I'd be glad I chose to have a more complex set up, because essentially, it caught the problem earlier. At this point, you'd stop and try and figure out what the screw up is.

Yes, you'll lose data, but would you have lost MORE data if you'd only done the two backups?

I think I'm not explaining myself correctly. Here's a more detailed try:

I can do a nightly full and an hourly trans (9am, 10am, 11am...)
- or -
a nightly full and an hourly differential (9am, 10am, 11am...)

In the first case, if I want to recover to 3pm status, I'll have to use the nightly full and the 9am trans, the 10am trans, the 11am trans, the 12noon trans, the 1pm trans, the 2pm trans and the 3pm trans, to recover all the things that happened that day.

In the second case, to get to 3pm, I need only the nightly full and the 3pm diff, no?

If I comprehended the concept of the trans and differential correctly, the differential carries everything since the last full, while the trans carries only the stuff since the last trans, potentially requiring numerous steps to recover to a certain point, while a diff can do it in one step, at the cost of more space used in the backup file.

Also, if one of the trans files is corrupt, everything after that is unusable, since trans log restores build on one another. If one of the diff files is corrupt, the one just after may well be okay. In a trans recovery chain, -every- file in the chain has to be okay. Restore from a diff file is not dependent on any other diff files.

No, I'd like an hourly snapshot, I'm just not clear on what is the best way of achieving it.

I'm not sure that's possible, as snapshots only include full backups. There may be a way to do this via a 3rd party tool; I'd have to look into this. (I'm pretty new to snapshots). You can use them for differentials, but you'll need the VSS interface.

I think my terminology was sloppy here. I meant a backup-restore point, of the sort achieved by one of the normal SQL backup operations, not something exotic.
Post #1374016
Posted Wednesday, October 17, 2012 1:29 PM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Tuesday, January 15, 2013 8:43 AM
Points: 37, Visits: 82
pdanes (10/17/2012)
[quote]Jamie Scharbrough (10/17/2012)

The answer to this is: It depends.

I guess I'm approaching this with the view that this is a learning enviornment, and so you should do all the steps for practice.

That's certainly legit, but I'm trying to comprehend some justifications for various strategies at the same time.


I think I'm not explaining myself correctly. Here's a more detailed try:

I can do a nightly full and an hourly trans (9am, 10am, 11am...)
- or -
a nightly full and an hourly differential (9am, 10am, 11am...)

In the first case, if I want to recover to 3pm status, I'll have to use the nightly full and the 9am trans, the 10am trans, the 11am trans, the 12noon trans, the 1pm trans, the 2pm trans and the 3pm trans, to recover all the things that happened that day.

In the second case, to get to 3pm, I need only the nightly full and the 3pm diff, no?

If I comprehended the concept of the trans and differential correctly, the differential carries everything since the last full, while the trans carries only the stuff since the last trans, potentially requiring numerous steps to recover to a certain point, while a diff can do it in one step, at the cost of more space used in the backup file.

Also, if one of the trans files is corrupt, everything after that is unusable, since trans log restores build on one another. If one of the diff files is corrupt, the one just after may well be okay. In a trans recovery chain, -every- file in the chain has to be okay. Restore from a diff file is not dependent on any other diff files.


Okay I get what you're saying. And yes, you're right; so your best bet WOULD be to do a nightly full and an hourly diff. Sorry, I was hung up on the learning part.


I'm not sure that's possible, as snapshots only include full backups. There may be a way to do this via a 3rd party tool; I'd have to look into this. (I'm pretty new to snapshots). You can use them for differentials, but you'll need the VSS interface.

I think my terminology was sloppy here. I meant a backup-restore point, of the sort achieved by one of the normal SQL backup operations, not something exotic.


Ah, okay. If you want a specific point in time, a simple T-SQL statement could do it, or you could create a job.



_____________________________________________________________________
-Jamie Scharbrough
MCTS: SQL 2008R2
Post #1374021
Posted Wednesday, October 17, 2012 1:56 PM
Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Yesterday @ 6:27 PM
Points: 3,135, Visits: 11,479
My preferred strategy has always been to make a nightly full backup and transaction log backups every 15 minutes, 24x7. Backup the system databases daily. I usually don’t bother with differential backups, since they do not support point in time recovery.

Depending on your space available, retain the full backups three or more days and the transaction log backup for at least three days. If possible, get the backup directories backed up to tape daily and rotated offsite for disaster recovery.

You can setup the backup maintenance plans to automate the whole thing for a server and be up and running it a few minutes.

If you are running a version of SQL Server that supports it, make sure you set all backups to be compressed to save time and space.

I used this strategy on an installation of over 300 SQL Servers with over 5000 databases and never had any problems with it.


Post #1374034
Posted Wednesday, October 17, 2012 2:10 PM


Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Sunday, August 24, 2014 3:38 PM
Points: 336, Visits: 942
AmarettoSlim (10/17/2012)
After reading the posts on this thread I'm sensing you already kinda know the acceptable thresholds of data loss based on experience in the company.

I do. It's the paleontology department of a museum, I've worked here for years and have had numerous discussions with the scientists and curators about data - integrity, potential loss, redundancy, recovery, etc. I know how they work with the database - I wrote everything myself, and often help them with the work, or sit back and watch to see how well I judged their needs and expectations, based on how easily they can operate something I designed, so I'm very familiar with the entire operation.

You're dealing with very small databases, the IO cost to back them up is not daunting to your servers performance I'm sure (but then again I don't know the specs and you should check this before hand, I'm basing this off the low MB numbers you've mentioned and semi-modern hardware).

Yes, this machine very lightly loaded, and all activity is during normal working hours. At night I can have it do whatever I want.

I propose based on what I've read:

Full Backup - Weekly (day with least SQL activity)
Differential Backup - Every 12 hours (excluding the time frame you take your full backup)

Is there some reason why you don't recommend that I do an hourly snap? I would like to do that, and particularly, I would like to do it 'on demand', that is, when the database is in use. No point in making repeated copies of something that isn't changing, but I want to try something like a 'last date/time of ANY database mods' value, and use that in a script to determine whether or not to do an hourly backup (trans or diff, I'm still fuzzy on that).

You haven't mentioned retention much so you'll want to determine that, don't forget about it or one day your drive will fill with the backups

I'll certainly keep an eye on that, but at the current rate of activity, the machine will probably wear out before I fill the drive.

From a recovery standpoint though it will be pretty simple, restore the latest weekly full backup and then restore the latest differential.

If you do the differentials at 12 AM and 12 PM (again, assuming that's acceptable regarding workload/performance) you could have a failure at 9 AM, and you restore back to midnight, or if its in the afternoon you restore back to 12 noon. While this doesn't give you point in time recovery based on your details it keeps the recovery well within acceptance. (DBs using this strategy should be in SIMPLE recovery model as Point-In-Time recovery is not needed, you will let the engine manage your .LDFs)

If I do it that way and use hourly differential backups, wouldn't I have almost effectively Point-In-Time, at least to the most recent hour? If the DB blows at 11:05, I can use the nightly full plus the 11am diff. If it blows at 10:55, I can use the nightly full plus the 10am diff. That gives me a very simple recovery model, with little potential data loss.

If your DBs start growing quickly, the IO costs are going to increase and you will need to re-adjust, keep that in mind. DR is never a set it and forget it deal. Also, it's imperative to test the backups once you've implemented your backup strategy.

They will increase somewhat in the coming year or two, but probably not by a great deal. In any case, I'm the only show in town. Whatever happens on the machine goes through me, so I have no trouble keeping on top of what's happening with it.

If you're wondering why, here is a great read:

Yes, that's a good article. I also attended SQL in the City in London this summer, where this sort of stuff was addressed. Grant Fritchey also has a good video tutorial about it here: http://www.red-gate.com/products/dba/sql-backup/version-7/#dbateam
Post #1374042
Posted Wednesday, October 17, 2012 2:36 PM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: 2 days ago @ 4:47 PM
Points: 192, Visits: 400
pdanes (10/17/2012)
AmarettoSlim (10/17/2012)
After reading the posts on this thread I'm sensing you already kinda know the acceptable thresholds of data loss based on experience in the company.



I do. It's the paleontology department of a museum, I've worked here for years and have had numerous discussions with the scientists and curators about data - integrity, potential loss, redundancy, recovery, etc. I know how they work with the database - I wrote everything myself, and often help them with the work, or sit back and watch to see how well I judged their needs and expectations, based on how easily they can operate something I designed, so I'm very familiar with the entire operation.


Great!

You're dealing with very small databases, the IO cost to back them up is not daunting to your servers performance I'm sure (but then again I don't know the specs and you should check this before hand, I'm basing this off the low MB numbers you've mentioned and semi-modern hardware).



Yes, this machine very lightly loaded, and all activity is during normal working hours. At night I can have it do whatever I want.


Nice

I propose based on what I've read:

Full Backup - Weekly (day with least SQL activity)
Differential Backup - Every 12 hours (excluding the time frame you take your full backup)



Is there some reason why you don't recommend that I do an hourly snap? I would like to do that, and particularly, I would like to do it 'on demand', that is, when the database is in use. No point in making repeated copies of something that isn't changing, but I want to try something like a 'last date/time of ANY database mods' value, and use that in a script to determine whether or not to do an hourly backup (trans or diff, I'm still fuzzy on that).


I was going for simplicity with a blend of recent data. If you want to do hourly diffs, I recommend doing daily full, hourly diffs. Purely because then recovery is again quick. Restore latest daily full, then most recent diff before the point of failure.

You haven't mentioned retention much so you'll want to determine that, don't forget about it or one day your drive will fill with the backups

I'll certainly keep an eye on that, but at the current rate of activity, the machine will probably wear out before I fill the drive.


From a recovery standpoint though it will be pretty simple, restore the latest weekly full backup and then restore the latest differential.

If you do the differentials at 12 AM and 12 PM (again, assuming that's acceptable regarding workload/performance) you could have a failure at 9 AM, and you restore back to midnight, or if its in the afternoon you restore back to 12 noon. While this doesn't give you point in time recovery based on your details it keeps the recovery well within acceptance. (DBs using this strategy should be in SIMPLE recovery model as Point-In-Time recovery is not needed, you will let the engine manage your .LDFs)



If I do it that way and use hourly differential backups, wouldn't I have almost effectively Point-In-Time, at least to the most recent hour? If the DB blows at 11:05, I can use the nightly full plus the 11am diff. If it blows at 10:55, I can use the nightly full plus the 10am diff. That gives me a very simple recovery model, with little potential data loss.


Correct!


If your DBs start growing quickly, the IO costs are going to increase and you will need to re-adjust, keep that in mind. DR is never a set it and forget it deal. Also, it's imperative to test the backups once you've implemented your backup strategy.

They will increase somewhat in the coming year or two, but probably not by a great deal. In any case, I'm the only show in town. Whatever happens on the machine goes through me, so I have no trouble keeping on top of what's happening with it.[quote]

Sounds good!

[quote]If you're wondering why, here is a great read:

Yes, that's a good article. I also attended SQL in the City in London this summer, where this sort of stuff was addressed. Grant Fritchey also has a good video tutorial about it here: http://www.red-gate.com/products/dba/sql-backup/version-7/#dbateam


I will certainly check it out, I follow Fritchey, Brent and others on Twitter

Cheers!
Post #1374055
Posted Thursday, October 18, 2012 12:40 AM


Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Sunday, August 24, 2014 3:38 PM
Points: 336, Visits: 942
Okay, I think I've got a reasonable plan. I'll do a nightly full, and an hourly differential and 15-minute transaction log during the day. The last will be mostly for practice on my part, with the expectation that the hourly differential will be the actual restore point in case of problems. If I can improve the situation with the 15-minute trans logs, great, but the hourly diff will be the primary go-to in the event of a problem. Compression of course, with off-site storage of the nightly copies, and I will also look at making all of it 'on-demand', i.e., only when the database is seeing some activity, which does not happen every day. (And yes, I WILL test-restore my backups occasionally. )

Many thanks to everyone for the input.

-----------

And an extra note, I've been looking at the 'on-demand' bit, so I don't make unnecessary backups of something that hasn't changed. I made this small test script that I will use as a starting point for that functionality - maybe someone else will find it useful:

declare @P datetime, @N datetime, @D int
set @P = (SELECT max(isnull(last_user_update,0))
FROM sys.dm_db_index_usage_stats
WHERE database_id = DB_ID( 'PaleoData'))
print @P
set @N = getdate()
print @N
set @D = datediff(n,@P,@N)
if @D < 15 print 'Less than 15, do trans'
if @D < 60 print 'Less than 1 hour, do diff'
if @D < 1440 print 'Less than a day, do full'

Post #1374164
Posted Thursday, October 18, 2012 7:42 AM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: 2 days ago @ 7:14 AM
Points: 7,197, Visits: 6,341
Nevermind. I didn't read your last post correctly.

Brandie Tarvin, MCITP Database Administrator

Webpage: http://www.BrandieTarvin.net
LiveJournal Blog: http://brandietarvin.livejournal.com/
On LinkedIn!, Google+, and Twitter.

Freelance Writer: Shadowrun
Latchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.
Post #1374363
« Prev Topic | Next Topic »

Add to briefcase ««123»»

Permissions Expand / Collapse