SQL Clone
SQLServerCentral is supported by Redgate
Log in  ::  Register  ::  Not logged in

Table Backups and the TableArchive Database

A discussion of how a separate database is used for storing table backups.

In a recent article, one of the things I mentioned was saving table backups before important and/or risky processes. For example, I generally do this before truncating the next year school assignment and school waitlist tables and reloading the new assignments/waitlists. Sometimes these backups are removed soon after, but some backups are kept on-hand for a few months or a year in case there’s ever a question down the line about what the input data was when we ran the process.

As I discussed, the table backups generally are given a name with the format: [TableName]_yyyymmdd_[ReasonToExist]. But where do I put this type of table? The answer is a database specifically created for such backups. At Boston Public Schools, this is the TableArchive database.

When I first started at Boston Public Schools, it was common practice to put such backups directly onto the main production database. This might be a reasonable practice if the number of tables in the database is under 40-50. When you get to hundreds of tables, having the backups on the main production database is simply messy. Because of this, I created TableArchive.

One problem with having backup tables on the main production database is it causes bloat – we have copies of production data on a number of servers, backups of the production databases going back several days, etc. Including these never-changing tables in these backups simply causes backup/restore processes to take longer and chews up valuable server disk space where one backup of the TableArchive database is likely sufficient.

There are other issues as well storing these tables on the main production database. It makes finding the tables with real data more difficult as you need to wade through a number of mixed in backup tables. It makes cleanup of the backups more difficult – there are more tables to look through when you want to clear out the old, unnecessary backups. Also, it is simply a better database server organization separating these tables into their own database.

We recently did an analysis of our main production database – more than 60% of the 500+ tables were either table backups [some from many years ago] or were obsolete. We’re in the process of removing tables or placing them into TableArchive as appropriate. If we can shave several gigabytes off the backup size and significantly reduce the number of tables we have, the cleanup effort will be a success.

One consideration I originally had was to set this database up on a separate server, not on the main production server. At the time, we did not use linked servers at BPS, so I decided to place the TableArchive database on the main production server. The rationale was the backup tables should be query-able – we might want to write queries to make comparisons between the backup and the actual production version of the table. Our newly hired DBA has linked some of the servers together, and therefore there’s a possibility we might move the TableArchive database off the main production server entirely. At this point doing so is low priority, but I’ll post an update if this occurs.

So how do others handle table backups? Do others do something similar to what we’re doing at BPS? Do others rely solely on entire database backups? Are the table snapshots directly on the production server? Maybe some development groups never need to make backups of individual tables? Please leave a comment and let me know.


Posted by Jon Russell on 20 June 2011

I do not “backup a table” to organize my data by a specific time period. You should be able to query the time period with a good database design. The only reason I see to do this is for performance reasons for very large data sets, so the database only has to look through active data, instead of very old data, but I would welcome other opinions on this.

Sometimes I will make a copy of a table in cases when I may need to roll back and update.  In such cases, I would just leave table in the production database for a couple of weeks, or however long is necessary. In no circumstance does anyone have access to this backup table. To me, these tables are temporary, untested, and they usually do not have any indexes on them. Running a query against them could cause a problem with the production database and server. If anyone needs access to the “backup table” I will restore recent backup to a development server, which I have a job that runs every night to do this anyway. If anyone needs real-time data, I will publish the table with Transaction Replication to it can be queried on a reporting database.

Posted by Jon Russell on 20 June 2011

Steve Jones posted a good article today relating to this. www.sqlservercentral.com/.../74354

Leave a Comment

Please register or log in to leave a comment.