Downtime


Steve Jones
SSC-Forever (40K reputation)

Group: Administrators
Points: 40695 Visits: 18851
Comments posted to this topic are about the item Downtime

Follow me on Twitter: @way0utwest
Forum Etiquette: How to post data/code on a forum to get the best help
My Blog: www.voiceofthedba.com
Gary Varga
SSChampion (10K reputation)

Group: General Forum Members
Points: 10189 Visits: 6345
I once had a role commissioning a set of servers and associated hardware in what was, at the time, Europe's biggest data centre. The servers housed various internet and intranet offerings that were critical to the company. Given my background as a software developer, it was an odd few days that I spent standing in Matrix-esque aisles, pulling out cables to simulate failing hardware and the like.

At the time I thought it was a little ridiculous; looking back, however, it was perhaps not quite as ridiculous as not doing it.

Gaz

-- Stop your grinnin' and drop your linen...they're everywhere!!!
Datagod-309892
Grasshopper (13 reputation)

Group: General Forum Members
Points: 13 Visits: 98
I went to Amazon's first big AWS conference with a client. We attended a session that included a discussion of Netflix's Chaos Monkey. My client was so impressed that when we got back to Canada he announced that at least once a week the "Chaos Bear" would walk into the server room and unplug any undocumented hardware. This was terrifying, but at the same time it motivated the staff to finish their documentation and properly label all the racks and racks of hardware.

An interesting consequence was that sometimes servers would be shut down and nobody would notice. Those servers were NOT turned back on, and did not survive the move to another data center.
Gary Varga
SSChampion (10K reputation)

Group: General Forum Members
Points: 10189 Visits: 6345
Datagod-309892 (3/3/2014)
I wonder if the same technique can be applied to staff? (whistling)

Gaz

EricEyster
Old Hand (312 reputation)

Group: General Forum Members
Points: 312 Visits: 520
It is remarkable when organizations change from risk avoidance to risk preparedness. Many companies build systems (computer or otherwise) and hope the big issue never happens. Eventually, it does.

I was called crazy the first time I walked into our data center to perform “pull the plug” testing in the middle of the day. We found issues and needed an MS patch to be 100% ready to go. We had a few minutes of downtime during the test, but it would have been much longer had the failure occurred in the middle of the night with nobody ready to work the issue.
Tritoch
Forum Newbie (9 reputation)

Group: General Forum Members
Points: 9 Visits: 67
We do everything we can to keep our applications online. We avoid patches...


I would much rather take a small amount of periodic planned downtime to patch than have a large amount of unplanned downtime when my unpatched systems get infected, compromised or crash due to a bug that should've been patched. Plus there are often optimizations that come from patching as well.
Gary Varga
SSChampion (10K reputation)

Group: General Forum Members
Points: 10189 Visits: 6345
Tritoch (3/3/2014)
I think that there is a balance to be struck. For in-house applications I would hope that the development team wouldn't have the impression that they could release on an ad-hoc basis. I know, and benefit from, the agile principle of releasing little and often, but updates to systems that interact with others should not be taken lightly.

Gaz

EricEyster
Old Hand (312 reputation)

Group: General Forum Members
Points: 312 Visits: 520
Tritoch (3/3/2014)
Agreed. One of the data centers I worked in did not keep patches up to date. Then Slammer was released. Building systems with the near-monthly patch reboot in mind goes hand in hand with other HA requirements.
Steve Jones
SSC-Forever (40K reputation)

Group: Administrators
Points: 40695 Visits: 18851
Datagod-309892 (3/3/2014)
Nice. I've always assumed lots of servers weren't necessarily being used, or at least not regularly, in many data centers. It would be worth shutting them down, or these days converting them from physical to virtual (p->v), and bringing them up when needed.

Gary Varga
SSChampion (10K reputation)

Group: General Forum Members
Points: 10189 Visits: 6345
Steve's p->v would give me the security that I would require for such a strategy. Maybe I am just a little more risk averse.

Gaz
