Click here to monitor SSC
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


'Best' HA/DR option to protect against data corruption


'Best' HA/DR option to protect against data corruption

Author
Message
msandico 57892
msandico 57892
SSC Rookie
SSC Rookie (25 reputation)SSC Rookie (25 reputation)SSC Rookie (25 reputation)SSC Rookie (25 reputation)SSC Rookie (25 reputation)SSC Rookie (25 reputation)SSC Rookie (25 reputation)SSC Rookie (25 reputation)

Group: General Forum Members
Points: 25 Visits: 195
GilaMonster (7/29/2014)
Start with regularly tested backups and frequent database integrity checks. You should be running integrity checks often enough that it's caught before ending up in the backups. Or, if there isn't a large enough maintenance window, restore the backups to another server and CheckDB that.

Second, brush up on restore options. A 1TB database is probably in multiple files/filegroups. If the corruption was isolated, you should have been able to restore just the affected file/filegroup. Usually a lot faster than restoring the entire DB and if you have SQL enterprise edition, the restore is 'online' (rest of the database is online and usable). Depending on the extent of the damage, you may have been able to do page restores, restoring just the corrupted pages.

Once that's in place, then start looking at HA/DR that support your SLAs.


Thanks Gila, as this is a third-party application there is only 1 filegroup (PRIMARY), and it contains 10 years worth of data, therefore it's size. I will brush up on restore options however.

Also, as you mentioned, I also have asked our team to investigate the feasibility of increasing our checkdb from weekly to daily (this one takes ~4hrs to run), but we have 100+ other SQL servers that are smaller in size and could run in an hour or less.

Also, the extent of the damage was pretty bad - over 3500+ consistency errors spread amongst 10 tables. When we finally recovered , I compared my checktable pagecount to the checkdb pagecount initially for those 10 tables and the difference was about 36k pages difference in the restored db versus the corrupted database (so I assume if I did repair_allow_data_loss option I may have lost 36k pages!).
GilaMonster
GilaMonster
SSC-Forever
SSC-Forever (47K reputation)SSC-Forever (47K reputation)SSC-Forever (47K reputation)SSC-Forever (47K reputation)SSC-Forever (47K reputation)SSC-Forever (47K reputation)SSC-Forever (47K reputation)SSC-Forever (47K reputation)

Group: General Forum Members
Points: 47204 Visits: 44367
I said nothing implying you should have repaired...

See if you can get some automated restore tests set up. Restore backups on another server, run CheckDB. Off the prod server means it doesn't matter how long they take, does mean you will know if your backups are good.

See if there's any way you can archive older data into tables on another filegroup or another database entirely. Generally with a DB like that, the active portion is tiny compared to the entire DB.


Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass


msandico 57892
msandico 57892
SSC Rookie
SSC Rookie (25 reputation)SSC Rookie (25 reputation)SSC Rookie (25 reputation)SSC Rookie (25 reputation)SSC Rookie (25 reputation)SSC Rookie (25 reputation)SSC Rookie (25 reputation)SSC Rookie (25 reputation)

Group: General Forum Members
Points: 25 Visits: 195
GilaMonster (7/29/2014)
I said nothing implying you should have repaired...

See if you can get some automated restore tests set up. Restore backups on another server, run CheckDB. Off the prod server means it doesn't matter how long they take, does mean you will know if your backups are good.

See if there's any way you can archive older data into tables on another filegroup or another database entirely. Generally with a DB like that, the active portion is tiny compared to the entire DB.


Ah I should have mentioned we had multiple 'solutions' occurring in parallel and repair_allow_data_loss was one of them as an absolute last resort.

And yes, I managed to archive one table (yippee) into another database last year but their biggest table, an audit table, taking 300GB out of the 1TB, cannot be touched according to the business, they want all history in there..and accessible if needed..ugh.

And yes (again), doing more frequent checkdbs is definitely being looked at..thanks again everyone.
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search