Automated backup plan

  • My preferred strategy has always been a nightly full backup plus transaction log backups every 15 minutes, 24x7. Back up the system databases daily. I usually don’t bother with differential backups, since they do not support point-in-time recovery.

    Depending on the space available, retain the full backups for three or more days and the transaction log backups for at least three days. If possible, have the backup directories backed up to tape daily and rotated offsite for disaster recovery.

    You can set up backup maintenance plans to automate the whole thing for a server and be up and running in a few minutes.

    If you are running a version of SQL Server that supports it, make sure you set all backups to be compressed to save time and space.

    I used this strategy on an installation of over 300 SQL Servers with over 5000 databases and never had any problems with it.
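
    A minimal T-SQL sketch of that schedule (the database name and disk paths are placeholders; WITH COMPRESSION needs an edition that supports backup compression):

    ```sql
    -- Nightly full backup (run once per day, e.g. from a SQL Server Agent job)
    BACKUP DATABASE MyDatabase
        TO DISK = 'D:\Backups\MyDatabase_full.bak'
        WITH COMPRESSION, INIT;

    -- Transaction log backup (run every 15 minutes; requires the FULL recovery model)
    -- Real jobs usually timestamp the file name rather than reuse one file
    BACKUP LOG MyDatabase
        TO DISK = 'D:\Backups\MyDatabase_log.trn'
        WITH COMPRESSION;
    ```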

  • AmarettoSlim (10/17/2012)


    After reading the posts on this thread I'm sensing you already kinda know the acceptable thresholds of data loss based on experience in the company.

    I do. It's the paleontology department of a museum, I've worked here for years and have had numerous discussions with the scientists and curators about data - integrity, potential loss, redundancy, recovery, etc. I know how they work with the database - I wrote everything myself, and often help them with the work, or sit back and watch to see how well I judged their needs and expectations, based on how easily they can operate something I designed, so I'm very familiar with the entire operation.

    You're dealing with very small databases, so I'm sure the I/O cost of backing them up won't strain your server's performance (then again, I don't know the specs, and you should check this beforehand; I'm basing this on the low MB numbers you've mentioned and semi-modern hardware).

    Yes, this machine is very lightly loaded, and all activity is during normal working hours. At night I can have it do whatever I want.

    I propose based on what I've read:

    Full Backup - Weekly (day with least SQL activity)

    Differential Backup - Every 12 hours (excluding the time frame you take your full backup)

    Is there some reason why you don't recommend that I do an hourly snap? I would like to do that, and particularly, I would like to do it 'on demand', that is, when the database is in use. No point in making repeated copies of something that isn't changing, but I want to try something like a 'last date/time of ANY database mods' value, and use that in a script to determine whether or not to do an hourly backup (trans or diff, I'm still fuzzy on that).

    You haven't mentioned retention much, so you'll want to determine that. Don't forget about it, or one day your drive will fill with backups 😉

    I'll certainly keep an eye on that, but at the current rate of activity, the machine will probably wear out before I fill the drive.

    From a recovery standpoint though it will be pretty simple, restore the latest weekly full backup and then restore the latest differential.

    If you do the differentials at 12 AM and 12 PM (again, assuming that's acceptable regarding workload/performance) and you have a failure at 9 AM, you restore back to midnight; if it's in the afternoon, you restore back to 12 noon. While this doesn't give you point-in-time recovery, based on your details it keeps the recovery well within acceptance. (DBs using this strategy should be in the SIMPLE recovery model, as point-in-time recovery is not needed; you let the engine manage your .LDFs.)

    If I do it that way and use hourly differential backups, wouldn't I have almost effectively Point-In-Time, at least to the most recent hour? If the DB blows at 11:05, I can use the nightly full plus the 11am diff. If it blows at 10:55, I can use the nightly full plus the 10am diff. That gives me a very simple recovery model, with little potential data loss.

    If your DBs start growing quickly, the IO costs are going to increase and you will need to re-adjust; keep that in mind. DR is never a set-it-and-forget-it deal. Also, it's imperative to test your backups once you've implemented your backup strategy.

    They will increase somewhat in the coming year or two, but probably not by a great deal. In any case, I'm the only show in town. Whatever happens on the machine goes through me, so I have no trouble keeping on top of what's happening with it.

    If you're wondering why, here is a great read:

    Yes, that's a good article. I also attended SQL in the City in London this summer, where this sort of stuff was addressed. Grant Fritchey also has a good video tutorial about it here: http://www.red-gate.com/products/dba/sql-backup/version-7/#dbateam

  • pdanes (10/17/2012)


    AmarettoSlim (10/17/2012)


    After reading the posts on this thread I'm sensing you already kinda know the acceptable thresholds of data loss based on experience in the company.

    I do. It's the paleontology department of a museum, I've worked here for years and have had numerous discussions with the scientists and curators about data - integrity, potential loss, redundancy, recovery, etc. I know how they work with the database - I wrote everything myself, and often help them with the work, or sit back and watch to see how well I judged their needs and expectations, based on how easily they can operate something I designed, so I'm very familiar with the entire operation.

    Great!

    You're dealing with very small databases, so I'm sure the I/O cost of backing them up won't strain your server's performance (then again, I don't know the specs, and you should check this beforehand; I'm basing this on the low MB numbers you've mentioned and semi-modern hardware).

    Yes, this machine is very lightly loaded, and all activity is during normal working hours. At night I can have it do whatever I want.

    Nice

    I propose based on what I've read:

    Full Backup - Weekly (day with least SQL activity)

    Differential Backup - Every 12 hours (excluding the time frame you take your full backup)

    Is there some reason why you don't recommend that I do an hourly snap? I would like to do that, and particularly, I would like to do it 'on demand', that is, when the database is in use. No point in making repeated copies of something that isn't changing, but I want to try something like a 'last date/time of ANY database mods' value, and use that in a script to determine whether or not to do an hourly backup (trans or diff, I'm still fuzzy on that).

    I was going for simplicity with a blend of recent data. If you want to do hourly diffs, I recommend doing daily full, hourly diffs. Purely because then recovery is again quick. Restore latest daily full, then most recent diff before the point of failure.

    You haven't mentioned retention much, so you'll want to determine that. Don't forget about it, or one day your drive will fill with backups 😉

    I'll certainly keep an eye on that, but at the current rate of activity, the machine will probably wear out before I fill the drive.

    From a recovery standpoint though it will be pretty simple, restore the latest weekly full backup and then restore the latest differential.

    If you do the differentials at 12 AM and 12 PM (again, assuming that's acceptable regarding workload/performance) and you have a failure at 9 AM, you restore back to midnight; if it's in the afternoon, you restore back to 12 noon. While this doesn't give you point-in-time recovery, based on your details it keeps the recovery well within acceptance. (DBs using this strategy should be in the SIMPLE recovery model, as point-in-time recovery is not needed; you let the engine manage your .LDFs.)

    If I do it that way and use hourly differential backups, wouldn't I have almost effectively Point-In-Time, at least to the most recent hour? If the DB blows at 11:05, I can use the nightly full plus the 11am diff. If it blows at 10:55, I can use the nightly full plus the 10am diff. That gives me a very simple recovery model, with little potential data loss.

    Correct!
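
    For the record, that full-plus-diff restore is just two steps in T-SQL (the backup file paths are placeholders):

    ```sql
    -- Step 1: restore the most recent full backup, leaving the DB in RESTORING state
    RESTORE DATABASE PaleoData
        FROM DISK = 'D:\Backups\PaleoData_full.bak'
        WITH NORECOVERY;

    -- Step 2: apply the most recent differential and bring the DB online
    RESTORE DATABASE PaleoData
        FROM DISK = 'D:\Backups\PaleoData_diff.bak'
        WITH RECOVERY;
    ```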

    If your DBs start growing quickly, the IO costs are going to increase and you will need to re-adjust; keep that in mind. DR is never a set-it-and-forget-it deal. Also, it's imperative to test your backups once you've implemented your backup strategy.

    They will increase somewhat in the coming year or two, but probably not by a great deal. In any case, I'm the only show in town. Whatever happens on the machine goes through me, so I have no trouble keeping on top of what's happening with it.

    Sounds good!

    If you're wondering why, here is a great read:

    Yes, that's a good article. I also attended SQL in the City in London this summer, where this sort of stuff was addressed. Grant Fritchey also has a good video tutorial about it here: http://www.red-gate.com/products/dba/sql-backup/version-7/#dbateam

    I will certainly check it out, I follow Fritchey, Brent and others on Twitter

    Cheers!

  • Okay, I think I've got a reasonable plan. I'll do a nightly full, hourly differentials, and 15-minute transaction log backups during the day. The last will be mostly for practice on my part, with the expectation that the hourly differential will be the actual restore point in case of problems. If I can improve the situation with the 15-minute trans logs, great, but the hourly diff will be the primary go-to in the event of a problem. Compression of course, with off-site storage of the nightly copies, and I will also look at making all of it 'on-demand', i.e., only when the database is seeing some activity, which does not happen every day. (And yes, I WILL test-restore my backups occasionally. 🙂 )
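
    For the 15-minute log piece of that plan, a point-in-time recovery would look roughly like this (paths and the STOPAT time are placeholders; it requires the FULL recovery model and an unbroken log backup chain):

    ```sql
    RESTORE DATABASE PaleoData
        FROM DISK = 'D:\Backups\PaleoData_full.bak' WITH NORECOVERY;

    RESTORE DATABASE PaleoData
        FROM DISK = 'D:\Backups\PaleoData_diff.bak' WITH NORECOVERY;

    -- Roll forward through the log to just before the failure
    RESTORE LOG PaleoData
        FROM DISK = 'D:\Backups\PaleoData_log.trn'
        WITH STOPAT = '2012-10-17 10:55:00', RECOVERY;
    ```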

    Many thanks to everyone for the input.

    -----------

    And an extra note, I've been looking at the 'on-demand' bit, so I don't make unnecessary backups of something that hasn't changed. I made this small test script that I will use as a starting point for that functionality - maybe someone else will find it useful:

    declare @P datetime, @N datetime, @D int

    -- Most recent data modification in the target database
    -- (note: sys.dm_db_index_usage_stats is cleared when the server restarts)
    set @P = (SELECT max(isnull(last_user_update, 0))
              FROM sys.dm_db_index_usage_stats
              WHERE database_id = DB_ID('PaleoData'))
    print @P

    set @N = getdate()
    print @N

    -- Minutes elapsed since the last modification
    set @D = datediff(n, @P, @N)

    -- Pick the smallest backup that covers the recent activity
    if @D < 15 print 'Less than 15, do trans'
    else if @D < 60 print 'Less than 1 hour, do diff'
    else if @D < 1440 print 'Less than a day, do full'
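
    Once the thresholds are settled, the prints could be swapped for actual backup commands, something like this (paths are placeholders; the log backup branch assumes the FULL recovery model):

    ```sql
    if @D < 15
        BACKUP LOG PaleoData
            TO DISK = 'D:\Backups\PaleoData_log.trn'
            WITH COMPRESSION;
    else if @D < 60
        BACKUP DATABASE PaleoData
            TO DISK = 'D:\Backups\PaleoData_diff.bak'
            WITH DIFFERENTIAL, COMPRESSION;
    ```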

  • Nevermind. I didn't read your last post correctly.

    Brandie Tarvin, MCITP Database Administrator. LiveJournal Blog: http://brandietarvin.livejournal.com/. On LinkedIn!, Google+, and Twitter. Freelance Writer: Shadowrun. Latchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • This post is not in disagreement with any prior posts, just offering a perspective based upon my own experiences.

    The choice between the Full and Simple database recovery models usually comes down to a choice between downtime and data loss. Which can your employer tolerate the best? It's a business decision, not a technical one.

    In general, the Full recovery model will recover more data but take longer for your database to be brought back online.

    In general, the Simple recovery model will lose more data but take less time for your database to be brought back online.

    In my last 2 jobs, my employers both used the Full recovery model when I was first hired. I posed questions to company management and found that downtime was more important than data loss. So, I implemented the Simple recovery model.

    As an example, my implementation was straightforward:

    1. One full database backup of all databases every night.

    2. Three differential database backups of all databases taken every day, spaced 6 hours apart.

    3. Database backup file retention was 1 week.

    This guaranteed the company, assuming the database backups had integrity, no more than a 6-hour loss of data, which was acceptable to management.

    My last employer purchased a SAN. It had the ability to perform synchronized database data and log file snapshots, guaranteeing database integrity when restored. I created a database snapshot schedule that performed snapshots every 4 hours, with a 1 week retention.

    The company would have liked to have had a longer database backup and snapshot retention period, and more frequent snapshots, which I specifically recommended, but the limiting factor was available disk space, so we agreed to a 1 week retention.

  • Hey, Lee, thanks for your input. It's always good to hear different perspectives, especially when those thoughts are accompanied by sound reasoning and underscored by experience. In my case, data loss is considerably more important than downtime. Some of either is tolerable, but there is lots of work for people to do that doesn't require the database, and having to do something over is a massive pain in the fundament, so a database offline is vastly preferable to redoing work already done once. That's why I went with the nightly full, hourly diff and 15-minute log. Not that up-to-the-minute data is somehow critical, but more to keep the natives from getting restless.

    I'm also fortunate that I have vastly underutilized hardware, so I can afford to keep my backup copies 'until further notice'.
