SQLServerCentral is supported by Redgate

An Azure Outage


Steve Jones
SSC-Forever (41K reputation)

Group: Administrators
Points: 41024 Visits: 18868
Comments posted to this topic are about the item An Azure Outage

Follow me on Twitter: @way0utwest
Forum Etiquette: How to post data/code on a forum to get the best help
My Blog: www.voiceofthedba.com
Orlando Colamatteo
SSCrazy Eights (9.9K reputation)

Group: General Forum Members
Points: 9919 Visits: 14376
In my experience the weakest managers are the ones repeatedly leading a witch-hunt for this or that...terrible for morale and very anti-progress.

All moves have advantages and disadvantages. The cloud is not going to be right for all businesses, but I suspect it will be good enough, for enough of them, to prove it is here to stay.

__________________________________________________________________________________________________
There are no special teachers of virtue, because virtue is taught by the whole community. --Plato
addieleman@outlook.com
SSC Rookie (32 reputation)

Group: General Forum Members
Points: 32 Visits: 115
The last sentence: "However, I think lots of management might prefer in-house infrastructure for a simple reason: it gives them a specific neck to choke, and possibly replace, when things go wrong."

That may be true for some companies, but I've also seen the opposite: it's easier to put the blame on a third party because it looks like it frees managers from the duty to solve the problems. If SLAs are defined, it's also easier to explain why you do or don't bash a service supplier.
paul.knibbs
SSCrazy (2.1K reputation)

Group: General Forum Members
Points: 2124 Visits: 6223
I think the difference is, if one of your internal systems goes down for any reason, you're in control of getting it back up and running. If the "cloud" goes down, you're entirely in the hands of the company providing that service to restore your access, and this leaves you feeling a bit helpless. Plus, you kind of expect a company the size of Microsoft to have enough redundancy in place that you really shouldn't be getting 8-hour outages!
Phil Factor
SSC Eights! (953 reputation)

Group: General Forum Members
Points: 953 Visits: 2953
Yes, it was a technical problem and we all have sympathy for these because we experience them, and are sometimes responsible for them. Azure has, in general, performed very well and this incident is uncharacteristic. For me, the problem was that Microsoft's marketing department had previously over-egged the pudding by talking up the resilience of Azure ('Always up, Always on'). If they'd been more circumspect, and said that, on balance, there would be outages in any cloud service but these would probably be fewer than you'd expect from your own in-house IT infrastructure (the Azure SLA quotes 99.95% uptime), then it wouldn't have caused so much of a story. With marketing material, any IT manager needs to know by how much to dilute the claims, and they're likely to add plenty more water after this incident. After all, the occurrence of a leap year is rather more predictable than an earthquake.


Best wishes,

Phil Factor
Simple Talk
Gary Varga
SSChampion (10K reputation)

Group: General Forum Members
Points: 10331 Visits: 6348
This reminds me of when I was a passenger in a car recently. The driver was distracted by something on the other side of the road for a moment and noticed late that the road had started to bend. I confess I have done exactly the same. As a driver you have an "Oops!!!" moment whilst adjusting direction. As a passenger it's more like "Aaagggghhhh...we're all gonna die!!!". Basically, the driver notices the error and works on correcting it, safe in the knowledge that all is under control, whereas the passenger doesn't have any confidence until the adjustment is complete.

Anyone gone to pump the brakes whilst a passenger?

Gaz

-- Stop your grinnin' and drop your linen...they're everywhere!!!
paul s-306273
SSCrazy (2K reputation)

Group: General Forum Members
Points: 2040 Visits: 1080
addieleman (3/14/2012)
The last sentence: "However, I think lots of management might prefer in-house infrastructure for a simple reason: it gives them a specific neck to choke, and possibly replace, when things go wrong."

That may be true for some companies, but I've also seen the opposite: it's easier to put the blame on a third party because it looks like it frees managers from the duty to solve the problems. If SLAs are defined, it's also easier to explain why you do or don't bash a service supplier.


Quite right - where I work any incident has the phrase 'we are working with our 3rd party suppliers...'.

It's never OUR fault.
phegedusich
SSC-Enthusiastic (106 reputation)

Group: General Forum Members
Points: 106 Visits: 531
Phil Factor (3/14/2012): (the Azure SLA quotes 99.95% uptime).


So we shouldn't expect another outage for, oh, three years or so. Sounds good to me.

Redundant failover architecture should include the management tools, folks. I'm spouting because I don't know the nature of the problem or the technical solution, but hey, if the system were in-house, I'd know, wouldn't I?
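
For anyone who wants to check the arithmetic behind the 99.95% figure, here is a rough sketch. It assumes a non-leap 365-day year, a 30-day month, and the 8-hour outage length mentioned earlier in this thread; the function names are made up for illustration, and the real SLA may define its measurement window and exclusions quite differently.

```python
# Back-of-the-envelope downtime budget for an uptime SLA.
# Assumptions (not from the SLA text): 365-day year, 30-day month,
# and an 8-hour outage as mentioned earlier in the thread.

HOURS_PER_YEAR = 365 * 24   # 8760
HOURS_PER_MONTH = 30 * 24   # 720


def allowed_downtime_hours(sla: float, period_hours: float) -> float:
    """Hours of downtime that still satisfy the uptime fraction over the period."""
    return (1.0 - sla) * period_hours


def years_of_budget_used(outage_hours: float, sla: float) -> float:
    """How many years of yearly downtime allowance a single outage consumes."""
    return outage_hours / allowed_downtime_hours(sla, HOURS_PER_YEAR)


if __name__ == "__main__":
    sla = 0.9995
    per_month_min = allowed_downtime_hours(sla, HOURS_PER_MONTH) * 60
    per_year_h = allowed_downtime_hours(sla, HOURS_PER_YEAR)
    print(f"Allowed per 30-day month: {per_month_min:.1f} minutes")
    print(f"Allowed per year: {per_year_h:.2f} hours")
    print(f"An 8-hour outage uses {years_of_budget_used(8, sla):.1f} years of allowance")
```

Under those assumptions the allowance works out to about 21.6 minutes a month or 4.38 hours a year, so a single 8-hour outage eats a little under two years of it. A longer outage would push that figure up proportionally.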
Gary Varga
SSChampion (10K reputation)

Group: General Forum Members
Points: 10331 Visits: 6348
phegedusich (3/14/2012)
Phil Factor (3/14/2012): (the Azure SLA quotes 99.95% uptime).


So we shouldn't expect another outage for, oh, three years or so. Sounds good to me.

Redundant failover architecture should include the management tools, folks. I'm spouting because I don't know the nature of the problem or the technical solution, but hey, if the system were in-house, I'd know, wouldn't I?


Surely you would have to investigate before you knew anything beyond what was reported. Wouldn't you?

Gaz

-- Stop your grinnin' and drop your linen...they're everywhere!!!
Steve Jones
SSC-Forever (41K reputation)

Group: Administrators
Points: 41024 Visits: 18868
If you read the update and root cause analysis, this wasn't a redundancy issue. It was caused by a software bug, one that couldn't be fixed by more hardware. Developers had to build a fix, test it, and deploy it. This resulted in substantial delays, as many of us should be able to understand.

However, it also appears that MS wasn't as forthcoming initially, at least according to Gartner: http://blogs.gartner.com/kyle-hilgendorf/2012/03/09/azure-outage-customer-insights-a-week-later/

Apparently MS is offering credit for the day, which is something: http://www.zdnet.com/blog/microsoft/microsoft-to-provide-azure-users-with-33-percent-credit-for-february-outage/12154

Follow me on Twitter: @way0utwest
Forum Etiquette: How to post data/code on a forum to get the best help
My Blog: www.voiceofthedba.com