Safety in Numbers

  • john.w.walker (2/12/2010)

    No, 2 days of full backups are not enough. There was a hack that made the rounds 7-8 years ago and caught everyone over a holiday weekend. At a minimum, you need enough days of backups to cover the longest period during which your company isn't working and might not know it has been bitten.

    Good point. I haven't had an issue over a holiday weekend in a long time, though to be fair, most places I've worked have had people working on weekends (usually in another country).
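To make the point above concrete: the retention window has to cover the longest stretch nobody is watching, plus however long a problem can go unnoticed after people return. A minimal sketch of that arithmetic (the function name and figures are illustrative, not from the thread):

```python
# Hypothetical sketch: minimum days of daily full backups to keep so that at
# least one clean backup predates an incident that happened while nobody was
# watching (e.g. over a holiday weekend).

def min_retention_days(longest_idle_days: int, detection_lag_days: int) -> int:
    """Days of backups needed: idle window + detection lag, plus one so a
    backup taken just before the idle window began is still on hand."""
    return longest_idle_days + detection_lag_days + 1

# A 4-day holiday weekend plus 2 working days before anyone notices:
print(min_retention_days(4, 2))  # 7 -- two days of fulls would not be enough
```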

  • Alan Vogan (2/12/2010)

    Disk space is cheap! We keep a log on database growth and when we need more disk space, we order it.

    I've thought this for years, but when you see SAN disk prices, and sometimes mandates to move all disk there, it's not so cheap. It's not easy to get approval to order more when that cost becomes a significant line item and comes to a CPA's attention.

    If you can use DASD, this is easier, but even for backups you're taking a risk if you get a bad drive. I've usually made sure my backups are on RAID 5.

  • Steve Jones - Editor (2/12/2010)

    Alan Vogan (2/12/2010)

    Disk space is cheap! We keep a log on database growth and when we need more disk space, we order it.

    I've thought this for years, but when you see SAN disk prices, and sometimes mandates to move all disk there, it's not so cheap. It's not easy to get approval to order more when that cost becomes a significant line item and comes to a CPA's attention.

    If you can use DASD, this is easier, but even for backups you're taking a risk if you get a bad drive. I've usually made sure my backups are on RAID 5.

    You got me thinking, so I went and talked to our sysadmin. We have a SAN array: 1.5 TB, RAID 5, with 18 hot-swappable drives at 72 GB apiece. I think there is room for expansion. I just did the math and that's about right.

    I don't know how much that cost. However, besides having these backups go to tape and offsite, we also have a WD ShareSpace sitting in my work area: over 2 TB of RAID 5 storage for $600. We're using it for redundancy.

    Anything under $1000 and we're flying under the radar.

    So, is this good? Is it overkill? Is it convoluted? Let me know.
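A quick way to sanity-check the array numbers mentioned above (18 hot-swappable 72 GB drives in RAID 5): one drive's worth of capacity goes to parity, so usable space in a single RAID 5 group is (n - 1) × drive size. Purely illustrative arithmetic, assuming one parity group:

```python
# Rough capacity check for a single RAID 5 group: one drive lost to parity.

def raid5_usable_gb(drive_count: int, drive_size_gb: float) -> float:
    """Usable capacity of one RAID 5 group of identical drives."""
    return (drive_count - 1) * drive_size_gb

raw = 18 * 72                       # 1296 GB raw capacity
usable = raid5_usable_gb(18, 72)    # 1224 GB usable, roughly 1.2 TB
print(raw, usable)
```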

  • If it works for you, I think you're OK. I don't know anything about that product, but if it's RAID-protected and works, then it's a good idea. The catch is that those machines often aren't as reliable, and if there is an issue, you might end up replacing the whole thing. If you can easily get the money to do it, that's great. However, if you couldn't go buy a replacement in a day or two, what does that do to your process?

    Many larger corporations I've worked in don't have a quick approval process for anything over $250, so we can't depend on something like that. Contrast that with a larger server that's supported by the vendor, which we can get replaced under a service contract in a day.

  • I keep 8 days' worth of production backups, and fewer on test and dev machines.

    Only the other day I had to recover some objects that were accidentally deleted... and the only copy left was in the backup from 8 days ago.

  • I keep about a month's worth of backup both off-site and locally. Lucky for me, our databases are small 🙂


    Kevin C.

  • 80% of my backups are streamed off to Tivoli Storage Manager (TSM) via a SQL-aware client. Generally FULLs are run on the weekends, and daily or multiple-times-per-day LOG backups are streamed off. These backups are generally out of my control, but our TSM admin keeps a minimum of 30 days of backups.

    The remaining 20% of backups sit on local disks, and those disks are backed up daily to the same TSM system. An important distinction is that the first set mentioned above is restored by a SQL-aware client, while the latter set would be used to restore a whole disk volume, with no point-in-time recovery.

    Then I have one really odd duck: a corporate-critical application that is currently running on an unclustered server (don't ask!). It is backed up FULL every 12 hours, with LOG backups taken every hour except for the FULL hours. A special TSM schedule pulls those backups off every hour.

    As to the question of why one would keep the intermediate LOGs between FULLs: how else could you get back to a specific point in time between the two FULLs? I have lost count of the times I have had to restore a database to a specific point in time INTO ANOTHER TEMPORARY database just to extract the contents of one table that some power user had mangled beyond recognition. These requests usually start with a phone call from a user: "Is it possible to bring back table XYZ as of Tuesday last week at 2:00 PM?"
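The reasoning above comes down to building a restore chain: the most recent FULL before the target time, then every LOG backup after it up to and including the first one that covers the target. Here is a hedged sketch of that selection logic; the backup metadata is hypothetical (a real restore would read msdb history and use RESTORE ... WITH STOPAT):

```python
from datetime import datetime

# Sketch of picking a restore chain for a point-in-time restore into a
# scratch database. `backups` is a time-ordered list of (kind, finish_time)
# tuples, kind being 'FULL' or 'LOG'. Illustrative only.

def restore_chain(backups, target):
    """Return the backups to restore, in order, to reach `target`."""
    fulls = [b for b in backups if b[0] == 'FULL' and b[1] <= target]
    if not fulls:
        raise ValueError('no FULL backup precedes the target time')
    base = fulls[-1]
    chain = [base]
    for kind, t in backups:
        if kind == 'LOG' and t > base[1]:
            chain.append((kind, t))
            if t >= target:          # this log covers the target; stop here
                break
    return chain

backups = [
    ('FULL', datetime(2010, 2, 7)),
    ('LOG',  datetime(2010, 2, 7, 6)),
    ('LOG',  datetime(2010, 2, 7, 12)),
    ('FULL', datetime(2010, 2, 8)),
    ('LOG',  datetime(2010, 2, 8, 6)),
]
# Restoring to 09:00 on the 7th needs BOTH the 06:00 and 12:00 logs --
# without the intermediate logs, that point in time is unreachable.
print(restore_chain(backups, datetime(2010, 2, 7, 9)))
```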

  • We keep two days of full backups and transaction log backups (taken at intervals varying from 1 to 3 hours) on-server, with the entire backup directory backed up to tape nightly. Maintenance plans purge everything older than 2 days.
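The purge step in a maintenance plan like the one above boils down to deleting backup files older than the retention window. A minimal sketch, assuming a flat backup directory and conventional .bak/.trn extensions (both assumptions, not from the post; a real maintenance plan would use the built-in cleanup task):

```python
import time
from pathlib import Path

RETENTION_SECONDS = 2 * 24 * 3600  # keep 2 days, as in the plan above

def purge_old_backups(backup_dir, now=None):
    """Remove *.bak and *.trn files whose modification time is past retention.
    Returns the names of the files it deleted."""
    now = time.time() if now is None else now
    removed = []
    for f in Path(backup_dir).iterdir():
        if f.suffix in ('.bak', '.trn') and now - f.stat().st_mtime > RETENTION_SECONDS:
            f.unlink()
            removed.append(f.name)
    return removed
```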

  • GSquared (2/12/2010)

    For me, it depends on the data loss policies of the company, and the ability to recover data.

    For example, I used to manage a database that was just a tool for managing and processing mailing lists. Since the lists themselves existed outside the database and could be reimported and reprocessed pretty easily, and since the database itself was huge (terabytes), I didn't bother with anything beyond a weekly full backup and daily diff backups.

    Our reporting data looks something like this: huge amounts of data in simple recovery mode, with full backups done once or twice a week and incrementals on the other days. We keep only one set of backups at any time, since the last day or two could be regenerated by our processes if needed.

    I believe 60 days' worth is kept on tape by our support folks. Incidentally, they complained not about how much space these tape backups use but about how long it takes to move them off when our backup files are too big. That prompted us to split our backups into anywhere from 5 to 15 files per backup.


  • We keep three days of full backups (one per night) and log backups (taken at 10-minute intervals for core databases, 1-hour intervals for others) in a share. We keep the historic log backups in case we need to restore to a particular time to help with troubleshooting if any application support issues arise.

    Backups are performed across the network from all instances to a single share with instance-named subfolders on a SAN, the share being exposed through a VM. The entire SAN is mirrored to an identically-specced SAN at a different site.

    For the core databases, the full backups are restored to a 'training' environment immediately after they are created; all sensitive data is then removed and the training databases are backed up to a NAS device for ad hoc use by developers. A useful side effect of this process is that it allows us to prove the integrity of the core databases' backups. All non-core backups are tested on a rotational basis.

    The share is then backed up to tape and the tape(s) retained for 2 weeks, and on the second of every month we take all full backups from the first of the month and retain those on tape for 12 months.

    We also mirror the core databases to an identically-specced server, which is used to serve data for reports through the gift of database snapshots (well, they're not really a gift as using them requires a second set of server CPU licenses).

    We also create a snapshot of the core database on the live server at 06:30 every morning to allow us to manually recover from any DML or DDL change 'accidents' that occur during the day, without having to restore the entire database.

    We did initially meet resistance when discussing disk space requirements, but once we presented a justification, the budget was agreed.


  • As always, It Depends. For some systems, I keep only the most recent full backup and the log backups since. Other, more critical systems have several days' worth of both. For a few systems (financial accounting, etc.), I keep several days of backups in addition to several years' worth of month-end backups, in case we need to retrieve point-in-time fiscal data for a specific financial period (which has happened a few times over the years).

    Tim Mitchell, Microsoft Data Platform MVP
    Data Warehouse and ETL Consultant | @Tim_Mitchell |
    ETL Best Practices

  • It varies from database to database. There are in-house databases and databases at customer sites, databases that change rapidly and databases that change rarely, databases that are read-only, databases for OLTP and databases for MIS, large databases, small databases, and very small databases. I suspect I'm a good deal more cautious (or maybe just more paranoid) than Steve. These are the databases we had and what we did about backup and recovery; I believe the company is still doing the same.

    1) At each customer site: a rapidly changing database with current and recent information only, where recovery to a point in time is essential. These had (a) transactional replication to another server which could take over in an emergency (MS agreed that this server needed no SQL licences, since from the SQL point of view it was pure standby) and (b) the full recovery model with a full backup daily and log backups every 15 minutes; full backups (held on yet another server) were retained for 4 weeks, and log backups for as long as the full backup they were based on was retained.

    2) At each customer site: an archive database containing the history of activity in the above database, used to produce management information. (a) Transactional replication as above, plus (b) the simple recovery model with a full backup twice per month; each backup overwrites the last but 4, held on the same server as the backups from (1) above. (This is expensive on space, but safe.)

    3) At each customer site: a small database (describing the customisation of our product for the particular customer) that is updated very rarely by the customer or by our customer support team, and rather more often by our release and implementation people (the hope is that update by the customer becomes, in time, the most frequent). Full recovery model, daily full backup, twice-daily log backup, backups kept for 4 weeks. This may seem OTT, but the normal restore case here is that the customer changed his mind about some changes he made AFTER telling the test and "commit" system that the changes were all OK, and that happens anywhere from a few hours to a couple of weeks after the change was committed.

    4) At each customer site: small databases that are read-only at the customer site except when replaced completely by us. A copy is placed on the replication target server whenever one of these databases is replaced, and a full backup (to the server handling the backup filestore) is done at the same time and kept until the next time the db is replaced.

    5) In-house databases corresponding to (4) above: simple recovery model, daily full backup, backups kept for 3 months. (This is not expensive on space because these databases are very small.)

    6) In-house: prototype databases corresponding to (1) and (2) above (we transfer these "empty" databases to new customers as part of the installation process). Full or simple recovery model (depending on whether it corresponds to (1) or (2) above), with the same backup regime as at (1) or (2) but no replication. The backup media for these (and for (5) above and (7) and (8) below) are backed up daily as part of a large filestore backup, and those filestore backups are kept for a couple of weeks off-site.

    7) In-house: databases like that at (3) for new customers (system as yet uninstalled, or installed but not yet accepted), in which the creative team is doing the customisation required for the particular customer: full recovery model, daily full backups, twice-daily log backups, kept for 4 weeks.

    8) Copies from customer sites of the databases described at (3), used for testing/validation/QA. Simple recovery model, backed up once a week, kept for 3 weeks.

    I made sure to do at least one recovery from backup per year into dummy databases at each customer site, and to recover all in-house databases (to dummies) from backup at least three times per year. The backups might well be useless if recovery weren't checked regularly.
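Testing restores "on a rotational basis", as mentioned in a couple of posts above, can be as simple as cycling through the database list so every one gets exercised before any repeats. A small sketch (database names and period labels are placeholders):

```python
from itertools import cycle

# Hypothetical rotation for restore testing: assign one test-restore target
# per period, cycling through the full list so nothing is skipped.

def rotation_schedule(databases, periods):
    """Map each period to the next database in round-robin order."""
    picker = cycle(databases)
    return {period: next(picker) for period in periods}

dbs = ['customer_live', 'customer_archive', 'config', 'prototype']
print(rotation_schedule(dbs, ['Q1', 'Q2', 'Q3', 'Q4', 'Q5']))
```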


Viewing 12 posts - 31 through 41 (of 41 total)
