Plan for Failure

Notice I didn't title this "Plan to Fail." We should never plan to fail. That's sabotage. And that ain't right, as we say in the South.

However, when we do our planning, it is unrealistic to expect everything to work every time, especially when it comes to IT. Our systems get more complex with each passing year, and there are more and more points where a failure can occur. So it is realistic to plan for failure. After all, that's what recovery, and especially disaster recovery, is all about. Take, for example, the bridge in the picture (Photo: NOAA). This is the Ben Sawyer Bridge, and this is what it looked like after Hurricane Hugo got hold of it. Needless to say, in this state it wasn't usable for car traffic. That cut off the only means of reaching Sullivan's Island by normal land transportation.

When we architect systems and processes, we don't want to end up in the same sort of situation. We want to make sure that if there is a failure, our systems handle it gracefully. The only way they can is if we plan for failures to occur. And as we plan, we should consider the advice of those who have gone before us: for instance, the 8 Fallacies of Distributed Computing.

Fallacy #1 is an important one: the network is reliable. In other words, it is a fallacy to assume that the network is up every time you plan to use it. I have seen cases where someone was working on a server in a rack and the network cabling for a different server was affected. Sometimes this isn't obvious at all. For instance, the cable looks like it's still plugged into the back of the server, but it is just loose enough that good contact isn't being made. So now we have a physical fault, and the network, at least as far as that one server is concerned, is not available. Or it could be a case like the one a few years ago, when there were issues with some of the Broadcom NIC drivers and we were affected. In our case, the loss of network connectivity couldn't be predicted. Everything worked okay and then, *blip*, the NIC was off-line. What made matters worse was that, as far as the OS was concerned, everything looked fine. Now the fix was a simple one: log on locally and disable and re-enable the NIC, or simply reboot the box. However, that did mean someone had to log on locally. Unfortunately, that KVM wasn't network enabled at the time.
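One way to avoid trusting a link that only looks healthy is to probe the remote end before relying on it. The following is a minimal Python sketch, not something from the incident above: the function names, the retry counts, and the idea of probing a TCP port are all assumptions made for illustration.

```python
import socket
import time


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def wait_for_network(host, port, attempts=3, delay=2.0, probe=can_connect):
    """Probe up to `attempts` times; return True as soon as one probe succeeds.

    `probe` is a hypothetical hook so the check can be swapped out or tested.
    """
    for attempt in range(1, attempts + 1):
        if probe(host, port):
            return True
        if attempt < attempts:
            time.sleep(delay)  # give a transient blip a chance to clear
    return False
```

A job could call `wait_for_network` against the server it depends on and raise an alert when it returns False, rather than assuming the network is there because the OS reports the NIC as up.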

So planning for failure is a proper part of architecture design. Another part is looking to minimize the chances of failure. For instance, looking at fallacy #1, if I can get to a point where a network failure doesn't affect me, or where I can minimize the damage done, all the better. This may simply be a case of copying the flat file extract from the mainframe to the system holding SQL Server, where the SSIS package is going to run. Sure, a network failure during the copy could prevent me from starting the data import, but that's a lot better than a network failure occurring during the data import because I'm importing said file across the network. The first case is easy to deal with: I restore the network connectivity, get the file copied over, and start the ETL process. In the second case, I've got to restore the network connectivity, but I also quite likely have to do some data clean-up, depending on how far along I was when the failure occurred.
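The copy-first approach can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the `.partial` suffix, the checksum comparison, and the retry settings are hypothetical choices, not part of any SSIS or mainframe tooling.

```python
import hashlib
import shutil
import time
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            h.update(chunk)
    return h.hexdigest()


def stage_locally(remote: Path, local_dir: Path,
                  attempts: int = 3, delay: float = 5.0) -> Path:
    """Copy `remote` into `local_dir`, retrying on I/O errors.

    The file is copied under a temporary name and renamed only after the
    checksums match, so a half-copied file never looks like a finished one.
    """
    local_dir.mkdir(parents=True, exist_ok=True)
    final = local_dir / remote.name
    partial = local_dir / (remote.name + ".partial")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            shutil.copyfile(remote, partial)
            if file_digest(partial) != file_digest(remote):
                raise OSError("checksum mismatch after copy")
            partial.replace(final)  # rename within the same directory
            return final
        except OSError as exc:
            last_error = exc
            partial.unlink(missing_ok=True)  # discard the incomplete copy
            if attempt < attempts:
                time.sleep(delay)
    raise RuntimeError(
        f"could not stage {remote} after {attempts} attempts"
    ) from last_error
```

The import step would then be started only after `stage_locally` returns, pointing at the local path. If the network drops during the copy, the worst outcome is a discarded `.partial` file and a retry, with nothing to clean up in the database.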

So plan for failure. And look to minimize the impact of failure. Two general steps to include in architecture design.
