Archive Backups

  • Comments posted to this topic are about the item Archive Backups

  • Storage prices are low. That said, if storage cost really is a problem and you're not in the financial business with its retention requirements, then perhaps you can back up every now and then, check that the backup is OK, discard the previous one, and in addition keep one backup per quarter. That would feel a lot better than the solution they use, as I understand it...
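
    One way to do the "check that the backup is OK" step without a full restore is RESTORE VERIFYONLY; a minimal T-SQL sketch, where the database name and path are placeholders rather than anything from this thread:

        -- Take the backup with a checksum so page corruption is detectable
        BACKUP DATABASE ArchiveDB
            TO DISK = N'D:\Backups\ArchiveDB_Full.bak'
            WITH CHECKSUM, INIT;

        -- Confirm the file is readable and its checksums are valid
        -- (this does not replace an occasional real test restore)
        RESTORE VERIFYONLY
            FROM DISK = N'D:\Backups\ArchiveDB_Full.bak'
            WITH CHECKSUM;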

  • Outside of the SQL Server world, deduplication of data in backups is big business. It is how large organisations are making multi-million-dollar savings on storage costs.

    But this is hard to do with SQL Server backups unless the backup is done by a backup agent, and most SQL Server folk do not like backup agents because historically they have been unreliable when it comes to doing a recovery.

    The inability to deduplicate the data in SQL backups is maybe a major reason behind the proposal to just do a quarterly backup. In theory you could do daily differential backups against a quarterly full backup, but managing that volume of backups looks horrendous to me.
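
    A rough T-SQL sketch of that quarterly full plus daily differential scheme (database name and paths are illustrative only):

        -- Quarterly: the full backup that anchors the differential chain
        BACKUP DATABASE ArchiveDB
            TO DISK = N'D:\Backups\ArchiveDB_2024Q1_Full.bak'
            WITH CHECKSUM, INIT;

        -- Daily: each differential contains everything changed since that full,
        -- so a restore needs only the quarterly full plus the latest differential
        BACKUP DATABASE ArchiveDB
            TO DISK = N'D:\Backups\ArchiveDB_20240215_Diff.bak'
            WITH DIFFERENTIAL, CHECKSUM, INIT;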

    We need Microsoft and other backup vendors to come up with a solution that allows deduplication of SQL backup data, otherwise the storage cost of non-deduplicated SQL backups will start to become a reason for avoiding SQL Server.

    An idea used by another DBMS is called an 'offline backup merge'. This takes a full backup and a differential backup (or a chain of log backups) as input and produces a new full backup showing the state as at the end of the last backup merged in. The merge can be done on any server, with the merge process connecting to the original DB instance at the end of the work to register the new backup. This is not the full data deduplication that is possible with block-level backups, but it can help simplify database recovery for slowly changing data.
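
    SQL Server has no native equivalent of that merge, but the end result can be approximated on a scratch instance by restoring the old full plus the latest differential and then taking a fresh full backup. A hedged sketch, with every name and path a placeholder (note the new backup belongs to the scratch copy, so this only approximates the merge idea rather than registering a backup against the original database):

        -- On a scratch instance: rebuild the current state from the existing chain
        RESTORE DATABASE ArchiveDB_Merged
            FROM DISK = N'D:\Backups\ArchiveDB_2024Q1_Full.bak'
            WITH MOVE N'ArchiveDB'     TO N'E:\Scratch\ArchiveDB_Merged.mdf',
                 MOVE N'ArchiveDB_log' TO N'E:\Scratch\ArchiveDB_Merged.ldf',
                 NORECOVERY;

        RESTORE DATABASE ArchiveDB_Merged
            FROM DISK = N'D:\Backups\ArchiveDB_20240331_Diff.bak'
            WITH RECOVERY;

        -- Take a new "merged" full backup; the old full and its differentials can then be retired
        BACKUP DATABASE ArchiveDB_Merged
            TO DISK = N'D:\Backups\ArchiveDB_2024Q2_Full.bak'
            WITH CHECKSUM, INIT;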

    Original author: https://github.com/SQL-FineBuild/Common/wiki/ 1-click install and best practice configuration of SQL Server 2019, 2017, 2016, 2014, 2012, 2008 R2, 2008 and 2005.

    When I give food to the poor they call me a saint. When I ask why they are poor they call me a communist - Archbishop Hélder Câmara

  • Steve,

    Presumably the data is important, otherwise it wouldn't be saved. The key part of your editorial for me is that the images don't change often. But they do change. How would you know if anything changed between quarterly backups, or even monthly backups? If the data is important enough to retain, and it changes, it is important to make the investment to back it up daily - maybe weekly at the outside.

  • I didn't see any discussion of how frequently data is being added to the archive store; that's an absolutely critical point, especially if a gap opens up: quarterly backups of the archive, backups of the main store kept for, say, only 8 weeks... hit a problem just before the quarterly backup (which is not unlikely to be delayed; after all, it's "read only") and you completely lose at least 4 weeks of data from the first part of that quarter.

    Further, I disagree that a quarterly full and daily differentials would be a nightmare to manage; you're talking about fewer than a hundred backup files. If we can manage millions of rows in hundreds of tables, then surely a hundred files in a single directory path isn't that difficult, provided you take the time to set up management, validation, and restore-test protocols in the first place.
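
    As a hedged sketch of one such validation check (database name assumed), the backup history on the source instance can be queried to confirm the current differential chain is complete:

        -- List the most recent full backup and every differential taken since it
        SELECT bs.type,                       -- D = full, I = differential
               bs.backup_finish_date,
               bmf.physical_device_name
        FROM msdb.dbo.backupset AS bs
        JOIN msdb.dbo.backupmediafamily AS bmf
          ON bmf.media_set_id = bs.media_set_id
        WHERE bs.database_name = N'ArchiveDB'
          AND bs.backup_finish_date >=
              (SELECT MAX(backup_finish_date)
               FROM msdb.dbo.backupset
               WHERE database_name = N'ArchiveDB' AND type = 'D')
        ORDER BY bs.backup_finish_date;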

  • It's not difficult to manage the files, but in a crisis it can be an issue. If you don't have the data on hand on the same backup tape, that's a problem. In DR, things break. It's not a question of having the file(s); it's being able to put your hands on them when it matters.

  • I think the root problem is testing your DR solution. If you rely on tapes for DR then they should be tested (restored) on a regular basis as part of regular DR testing. If tapes are purely for archival (i.e. legal reasons) then maybe there's less motivation to test them regularly.

  • There are lots of ways to deal with this. There are tons of cheap tools that, used together, can handle this quite well. I can get a terabyte (or larger) external drive cheap, use XXCOPY to handle the adds, drops, and changes, then take the drive off-site and stick it in a bank vault.

    Then you could have a backup server. Use that to store your backups of pictures, documents, read-only (or mostly read-only) stuff, and the like. Then use a service like Barracuda or Carbonite to take care of your off-site copy. For slowly changing stuff these kinds of services seem ideal.

    For SQL you can do command-line backups using OSQL. Compress the backup with 7-Zip (it's free). Move the compressed file to your repository server and let the off-site service do its work while you sleep. Wrap the whole thing up in a command (batch) file and let the Windows Task Scheduler run it for you.
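
    A minimal sketch of the T-SQL such a scheduled script might run (names and paths are assumptions; the batch file would invoke it with something like osql -E -i, then hand the output file to 7-Zip and the copy step described above):

        -- Build a dated file name so each scheduled run produces a distinct file
        DECLARE @file nvarchar(260);
        SET @file = N'D:\Backups\ArchiveDB_'
                  + CONVERT(nvarchar(8), GETDATE(), 112)   -- yyyymmdd
                  + N'.bak';

        BACKUP DATABASE ArchiveDB
            TO DISK = @file
            WITH CHECKSUM, INIT;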

    As for the restore, well, you grab one of the files and do a practice restore. You are auditing and confirming your backups now, right? No? Stop reading this and go fix that problem now. Your minimum should be the following (a rough T-SQL sketch of steps 2-4 appears after the list):

    (1) Stop activity.

    (2) Do your backup.

    (3) Restore to a second machine.

    (4) Use your Redgate tools to compare the SQL objects and the data.
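
    A hedged T-SQL sketch of steps 2-4 (instance, database, and path names are illustrative; the final query is only a rough sanity check, not a substitute for a proper object and data compare):

        -- Step 2: back up on the production instance
        BACKUP DATABASE ArchiveDB
            TO DISK = N'D:\Backups\ArchiveDB_Check.bak'
            WITH CHECKSUM, INIT;

        -- Step 3: on the second machine, after copying the file across,
        -- restore under a different name and path
        RESTORE DATABASE ArchiveDB_Check
            FROM DISK = N'D:\Backups\ArchiveDB_Check.bak'
            WITH MOVE N'ArchiveDB'     TO N'E:\Data\ArchiveDB_Check.mdf',
                 MOVE N'ArchiveDB_log' TO N'E:\Data\ArchiveDB_Check.ldf';

        -- Step 4, rough check only: per-table row counts, run on both copies
        -- and compared; a real compare of objects and data still needs a tool
        SELECT s.name AS schema_name, t.name AS table_name, SUM(p.rows) AS row_count
        FROM sys.tables AS t
        JOIN sys.schemas    AS s ON s.schema_id = t.schema_id
        JOIN sys.partitions AS p ON p.object_id = t.object_id
                                AND p.index_id IN (0, 1)
        GROUP BY s.name, t.name
        ORDER BY s.name, t.name;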

    ATB
    Charles Kincaid

  • 80-90% under 2 GB? No one has made a comment on that statement?

    I assume that 80-90% isn't JUST SQL Server databases. If you include Access and all the other end-user databases out there, then yes, I agree. Access can't handle anything of any size, so that's probably the cause.

    I just have a hard time believing that 80-90% of the SQL Server databases out there are under 2 GB. Hell, I would guess that more than 80-90% are OVER 10 GB. That's just my opinion though.

    I handle about 50 databases across 2 servers and I would say that 80-90% are above 20 GB.


    Live to Throw
    Throw to Live
    Will Summers

  • If the data truly is archive data, then my immediate assumption would be that the data is on a different SLA than the rest of the database(s). If the archive is on a different tape or is accessible via some other means, then that may be acceptable. It may be fully acceptable to wait 24 hours for it to be restored (24 hours beyond the other databases).

    I would make sure there were multiple copies of the backup to restore from, however.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw
    Learn Extended Events

  • I saw a survey last year that listed SQL Server database sizes, and like an idiot I didn't save it. The figure might have been 5 GB, but it was single-digit GBs.

    I bet that most databases are that small, if you are counting by numbers. I had tons of them at JD Edwards that were under 1 GB. It seemed every web app and every little department app had a small database (and, unfortunately, its own instance) attached to it.

    We have 5 databases running this site: 1 is under 2 GB, 3 are right at 2 GB, and 1 is 6 GB.

    There are lots and lots of companies that are storing text (orders, sales, customers, etc.) and numbers, all of which is small. Plenty of people I know build databases that are under 1 GB to start with. They just don't anticipate the growth that will exceed that.

  • It could be that people are forgetting about CE. Take our Sales and Delivery Force automation (part of our Scout Suite). Our main database sits between the client back end and the mobile devices in the field. Most of our clients get good performance on a capable machine running Express; there's your 4 GB limit. Each device runs a copy of SQL Server Compact Edition. Some of our clients have several hundred users out in the field. The device has 128 MB of total storage, so the database is under that. It's real SQL Server.

    So if you are counting by numbers you have to include those. That brings down the average. The more successful I am, the more the average comes down. Sorry.

    ATB
    Charles Kincaid

  • Under 5 GB? I can't imagine how easy it would be to manage databases of that size. I don't think I would ever have to deal with a phone call about someone not getting near-instant response times. I deal with tables of over 150 million rows; some tables are themselves nearly 20 GB.

    I have views that join 20+ tables, and I would say that at least 25% of the tables in those views are a minimum of 10 GB apiece. I guess I'm dealing with large databases then.

    I was thinking that I was at the low end of database sizes. Hmph.


    Live to Throw
    Throw to Live
    Will Summers

