SQLServerCentral is supported by Red Gate Software Ltd.
Downtime
Posted Saturday, March 1, 2014 11:28 AM


Comments posted to this topic are about the item Downtime

Follow me on Twitter: @way0utwest

Forum Etiquette: How to post data/code on a forum to get the best help
Post #1546656
Posted Monday, March 3, 2014 2:38 AM


I once had a role commissioning a set of servers and associated hardware in what was, at the time, Europe's biggest data centre. The servers housed various internet and intranet offerings that were critical to the company. Given my background as a software developer, it was an odd few days, standing in Matrix-esque aisles pulling out cables to simulate failing hardware and the like.

At the time I thought it was a little ridiculous; looking back, however, it was perhaps not quite as ridiculous as not doing it.


Gaz

-- Stop your grinnin' and drop your linen...they're everywhere!!!
Post #1546825
Posted Monday, March 3, 2014 5:13 AM


I went to Amazon's first big AWS conference with a client. We attended a session that included a discussion of Netflix's Chaos Monkey. My client was so impressed that when we got back to Canada he announced that at least once a week the "Chaos Bear" would walk into the server room and unplug any undocumented hardware. This was terrifying, but at the same time it motivated the staff to finish their documentation and properly label all the racks and racks of hardware.

An interesting consequence was that sometimes servers would be shut down and nobody would notice. Those servers were NOT turned back on, and did not survive the move to another data center.
Post #1546858
Posted Monday, March 3, 2014 6:01 AM


Datagod-309892 (3/3/2014)
I went to Amazon's first big AWS conference with a client. We attended a session that included a discussion of Netflix's Chaos Monkey. My client was so impressed that when we got back to Canada he announced that at least once a week the "Chaos Bear" would walk into the server room and unplug any undocumented hardware. This was terrifying, but at the same time it motivated the staff to finish their documentation and properly label all the racks and racks of hardware.

An interesting consequence was that sometimes servers would be shut down and nobody would notice. Those servers were NOT turned back on, and did not survive the move to another data center.


I wonder if the same technique can be applied to staff?


Gaz

Post #1546877
Posted Monday, March 3, 2014 6:14 AM
It is remarkable when organizations change from risk avoidance to risk preparedness. Many companies build systems (computer or otherwise) and hope the big issue never happens. Eventually, it does.

I was called crazy the first time I walked into our data center to perform “pull the plug” testing in the middle of the day. We found issues and needed an MS patch to be 100% ready to go. We had a few minutes of downtime during the test, but it would have been much longer had the failure occurred in the middle of the night with nobody ready to work the issue.
Post #1546887
Posted Monday, March 3, 2014 7:11 AM
We do everything we can to keep our applications online. We avoid patches...


I would much rather take a small amount of periodic planned downtime to patch than have a large amount of unplanned downtime when my unpatched systems get infected, compromised or crash due to a bug that should've been patched. Plus there are often optimizations that come from patching as well.
Post #1546916
Posted Monday, March 3, 2014 7:23 AM


Tritoch (3/3/2014)
We do everything we can to keep our applications online. We avoid patches...


I would much rather take a small amount of periodic planned downtime to patch than have a large amount of unplanned downtime when my unpatched systems get infected, compromised or crash due to a bug that should've been patched. Plus there are often optimizations that come from patching as well.


I think that there is a balance to be struck. For in-house applications I would hope that the development team wouldn't have the impression that they could release on an ad-hoc basis. I know, and benefit from, the agile principle of releasing little and often, but updates to systems that interact with others should not be taken lightly.


Gaz

Post #1546924
Posted Monday, March 3, 2014 7:29 AM
Tritoch (3/3/2014)
We do everything we can to keep our applications online. We avoid patches...


I would much rather take a small amount of periodic planned downtime to patch than have a large amount of unplanned downtime when my unpatched systems get infected, compromised or crash due to a bug that should've been patched. Plus there are often optimizations that come from patching as well.


Agreed. One of the data centers I worked in did not keep patches up to date. Then Slammer was released. Building systems with the near-monthly patch reboot in mind goes hand in hand with other HA requirements.


Post #1546926
Posted Monday, March 3, 2014 8:50 AM


Datagod-309892 (3/3/2014)
I went to Amazon's first big AWS conference with a client. We attended a session that included a discussion of Netflix's Chaos Monkey. My client was so impressed that when we got back to Canada he announced that at least once a week the "Chaos Bear" would walk into the server room and unplug any undocumented hardware. This was terrifying, but at the same time it motivated the staff to finish their documentation and properly label all the racks and racks of hardware.

An interesting consequence was that sometimes servers would be shut down and nobody would notice. Those servers were NOT turned back on, and did not survive the move to another data center.


Nice. I've always assumed that lots of servers in many data centers weren't really being used, or at least not regularly. It would be worth shutting them down or, these days, converting them from physical to virtual (p->v), and bringing them up when needed.







Post #1546962
Posted Monday, March 3, 2014 9:05 AM


Steve's p->v would give me the security that I would require for such a strategy. Maybe I am just a little more risk averse.

Gaz

Post #1546973