Flexibility in DR

Disaster recovery matters to most people who work with databases. DBAs work to ensure their systems are highly available while practicing the skills needed to recover them in the event of a disaster. While we often practice for the failure of a complete server or database, we sometimes forget there are other issues for which we need to plan.

I ran across a story about a data center outage in London, which was similar to an incident I experienced. In this case, a piece of power equipment failed and systems lost network access. While many of us aren't responsible for the network, we certainly would still receive complaints if the database wasn't available.

In my situation, I was in a data center watching some workmen service UPS systems. At the time I was a manager of database systems and thought they had things under control. I left the data center and went downstairs, only to hear the power drop as all the servers seemed to stop running. Similar to the London outage, a switch that was supposed to change power from one set of UPSes to the other failed. Our entire global infrastructure for 10,000+ people and who knows how many customers went down. This was in the pre-cloud era, when we acted as our own cloud. And not very well.

There are systems outside of our databases that we depend on. We might not be responsible for them, but we ought to ensure they are a part of our disaster recovery plans and account for various things going wrong. At the very least, we ought to question whether power, network, storage, and more are adequately prepared for major issues.

We also need to think about minor issues. Atlassian had a major outage, at least for some customers, and they realized that they hadn't planned to recover parts of their databases. A similar issue might occur for any of us, where we might have to restore parts of a database, whether that's a table, a partition, or a series of rows. Corruption or human error might result in a set of data that's unreadable or even gone. I know I've accidentally caused data issues, and I'm careful. I learned to recover from my own mistakes and anticipate those of others. I practice not only full restores but also how to copy over part of a table from another location.
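
As a rough illustration of copying part of a table back from another location, here is a minimal sketch. It uses SQLite through Python's built-in sqlite3 module purely as a stand-in, not SQL Server, and the `orders` table, the `restored` schema alias, and the `restore_missing_rows` helper are all hypothetical names chosen for the example. The idea is that a backup has been restored alongside the live database, and only the rows missing from the live table are copied across.

```python
import sqlite3


def restore_missing_rows(conn, table, key):
    """Copy rows that exist in restored.<table> but not in main.<table>.

    Hypothetical helper: `table` and `key` are trusted identifiers
    supplied by the operator, never user input.
    Returns the number of rows copied.
    """
    sql = (
        f'INSERT INTO main."{table}" '
        f'SELECT r.* FROM restored."{table}" AS r '
        f'WHERE NOT EXISTS ('
        f'SELECT 1 FROM main."{table}" AS m WHERE m."{key}" = r."{key}")'
    )
    before = conn.total_changes
    with conn:  # commit on success, roll back on error
        conn.execute(sql)
    return conn.total_changes - before


def demo():
    # The live database is "main"; the restored copy is attached beside it.
    conn = sqlite3.connect(":memory:")
    conn.execute("ATTACH DATABASE ':memory:' AS restored")
    for schema in ("main", "restored"):
        conn.execute(
            f"CREATE TABLE {schema}.orders "
            "(id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
        )
    rows = [(1, "Ada", 10.0), (2, "Grace", 25.5),
            (3, "Edsger", 7.25), (4, "Barbara", 99.0)]
    conn.executemany("INSERT INTO restored.orders VALUES (?, ?, ?)", rows)
    conn.executemany("INSERT INTO main.orders VALUES (?, ?, ?)", rows)

    # Simulate the human error: two rows vanish from the live table.
    conn.execute("DELETE FROM main.orders WHERE id IN (2, 3)")
    conn.commit()

    copied = restore_missing_rows(conn, "orders", "id")
    ids = [r[0] for r in conn.execute("SELECT id FROM main.orders ORDER BY id")]
    return copied, ids


if __name__ == "__main__":
    print(demo())
```

The sketch only puts back rows that are absent, so running it twice copies nothing the second time. Rows that still exist but hold damaged values would need a different comparison, and that case is worth practicing separately.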

A disaster is a major problem, but it might only affect a minor part of our systems. We need to be ready for a problem of any size or scale, and to adapt our thinking and process so we meet the disaster with the appropriate actions.

Related content

Incident Response Data

Impact Minutes

Recovering Databases From a Master Backup

Make SQL Server Agent Jobs HADR Aware

DR as a Service