The Cost of Database Downtime

Steve Jones - SSC Editor

SSC Guru

Points: 734418
April 13, 2017 at 9:33 pm

#324362

Comments posted to this topic are about the item The Cost of Database Downtime

Jeff Moden SSC Guru Points: 1003853 · Answer 1

From the article:
However, if you're not in the direct business of selling something through your database platform directly to customers, you might not have a large cost.

Even if the web site isn't for "direct business of selling something", the cost can be much larger than a lot of people think, in the form of "collateral damage" in the eyes of current and potential new clients. In other words, an unexpected, prolonged outage may give the company a "black eye" in the eyes of its customers, and they sometimes think that you're only as good as your last failure.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Steve Jones - SSC Editor SSC Guru Points: 734418 · Answer 2

That certainly can be true, but so many businesses experience crashes and failures. I used to feel this way, but since almost everyone has an IT department that has unexplained failures, each organization has sympathy for the issues.

Does downtime cost Netflix? Sure, there's some cost. They may lose some people to Hulu or another service. Many people know that things happen, and while they grumble, they forget. If you go to the auto parts store and their point of sale system dies, and they can't process sales, do you give up on them? Do others, even dealers? Or do they understand that for this incident (hour, day, whatever), things will be slow as they write tickets by hand?

I think very few customers would think an entire business is untrustworthy in some way from a failure or two. If it happened repeatedly, there might be concern and business would leave, but a few isolated incidents are expected. My mechanic broke their lift and they were extremely slow for a few weeks. It happens.

What's good is that we in IT treat this stuff seriously and most people work hard to avoid constant downtime. What's not so good is management is willing to tolerate downtime (often) when they push software out the door without testing.

Eric M Russell SSC Guru Points: 125519 · Answer 3

... There is a report that database downtime averages $7900/minute. My guess (if that's true) is that we have something other than a normal distribution. There are probably some outliers that cost crazy amounts of money if their database goes down, while for most of us, it's an inconvenience or a small cost in the short term. Across a longer period (weeks), downtime might shut down the company. ...

Even for the same organization and database, the cost of downtime is not a normal distribution. A solid hour of downtime on a Monday afternoon could have a much different impact (if any) on business operations than an hour of downtime on Sunday morning, or than an hour of cumulative downtime (loss of connectivity, perhaps?) spread across a month in one- or two-minute intervals. It also depends on how fault tolerant your application and ETL are.
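One rough way to make that concrete is to price each outage window separately instead of applying a single average rate. The sketch below only illustrates the idea: the function name `total_downtime_cost` and the per-minute rates are placeholders made up for this example, not figures from any report or real system.

```python
from typing import Iterable, Tuple

# (minutes of downtime, assumed cost per minute during that window)
OutageWindow = Tuple[float, float]


def total_downtime_cost(windows: Iterable[OutageWindow]) -> float:
    """Sum minutes * per-minute cost over every outage window."""
    return sum(minutes * cost_per_minute for minutes, cost_per_minute in windows)


if __name__ == "__main__":
    # Placeholder rates for illustration only, not measurements.
    monday_afternoon = [(60, 100.0)]  # one solid hour at a busy time
    sunday_morning = [(60, 5.0)]      # one solid hour at a quiet time
    print(total_downtime_cost(monday_afternoon))
    print(total_downtime_cost(sunday_morning))
```

The same hour of downtime comes out very differently depending on the rate assumed for the window it falls in, which is the point being made above.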

"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

Eric M Russell SSC Guru Points: 125519 · Answer 4

Steve Jones - SSC Editor - Friday, April 14, 2017 6:40 AM
...
Does downtime cost Netflix? Sure, there's some cost. They may lose some people to Hulu or another service. Many people know that things happen, and while they grumble, they forget.
...

I'd guess that an hour of system outage for Netflix is roughly equivalent to a situation where a customer searches for a movie, can't find it on Netflix, but finds it on Amazon Prime instead. If someone is in the mood for a movie but encounters an availability snag (whether technical or otherwise), it doesn't take much for them to explore other online options. At least for me, both Netflix and Hulu are inexpensive enough that I subscribe to both and context switch between the two, because each provider tries to specialize in a specific type of content. For example, Netflix has the movies and Hulu has the TV series. However, if you have two companies providing a mostly undifferentiated product line (e.g., public utilities), then maximum availability is critical.

GeorgeCopeland SSCertifiable Points: 6997 · Answer 5

The system I work on has about 1000 online users who all make about $30/hr, so 1 minute of downtime ~ $500 in lost user productivity. Our managers should have this figure tattooed on the backs of their hands.
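The arithmetic behind that figure can be sketched as follows. This is a minimal sketch that assumes every one of the roughly 1000 users is completely blocked during the outage; the function name `lost_productivity_per_minute` is made up for illustration.

```python
def lost_productivity_per_minute(users: int, hourly_rate: float) -> float:
    """Dollars of user time lost per minute of downtime.

    Assumes every user is completely blocked while the system is down.
    """
    return users * hourly_rate / 60


if __name__ == "__main__":
    # Figures from the post above: about 1000 users at about $30/hr.
    print(lost_productivity_per_minute(1000, 30.0))  # prints 500.0
```

If only a fraction of users are actually blocked, the figure scales down in proportion.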

Steve Jones - SSC Editor SSC Guru Points: 734418 · Answer 6

Eric M Russell - Friday, April 14, 2017 7:24 AM
Even for the same organization and database, the cost of downtime is not a normal distribution. A solid hour of downtime on a Monday afternoon could have a much different impact (if any) on business operations than an hour of downtime on Sunday morning, or than an hour of cumulative downtime (loss of connectivity, perhaps?) spread across a month in one- or two-minute intervals. It also depends on how fault tolerant your application and ETL are.

Very true

Eric M Russell SSC Guru Points: 125519 · Answer 7

GeorgeCopeland - Friday, April 14, 2017 8:01 AM
The system I work on has about 1000 online users who all make about $30/hr, so 1 minute of downtime ~ $500 in lost user productivity. Our managers should have this figure tattooed on the backs of their hands.

1 minute of downtime ~ $500 cumulative loss in user productivity... So your managers should post it inside each bathroom stall, maybe even have a popup reminder each time they open their web browser.
😀

Jeff Moden SSC Guru Points: 1003853 · Answer 8

Steve Jones - SSC Editor - Friday, April 14, 2017 6:40 AM
I think very few customers would think an entire business is untrustworthy in some way from a failure or two.

Could be but, as an example, I can tell you that the comedy of errors that occurred at GitHub makes me think about not ever using their services.

--Jeff Moden

Eric M Russell SSC Guru Points: 125519 · Answer 9

Jeff Moden - Friday, April 14, 2017 2:09 PM
Steve Jones - SSC Editor - Friday, April 14, 2017 6:40 AM
I think very few customers would think an entire business is untrustworthy in some way from a failure or two.
Could be but, as an example, I can tell you that the comedy of errors that occurred at GitHub makes me think about not ever using their services.

Maybe I'm mistaken, but reading the story, it sounded like a one-man show on the back side of the database stage, which would be surprising considering the thousands of clients and TB of data they must manage. Where I'm at, the DBAs manage the scheduling of backups, and the Operations team manages the archival of backups to bulk storage. There are a lot of eyes on the ball.

Steve Jones - SSC Editor SSC Guru Points: 734418 · Answer 10

Jeff Moden - Friday, April 14, 2017 2:09 PM
Could be but, as an example, I can tell you that the comedy of errors that occurred at GitHub makes me think about not ever using their services.

What problems at GitHub? There haven't been major issues that I've seen. There was a hacking problem, but that seems to be par for any service. There are some outages, but none I'd consider major.

If you're talking about the backup/restore headache from deleting a replication primary, that was GitLab.

Jeff Moden SSC Guru Points: 1003853 · Answer 11

Steve Jones - SSC Editor - Saturday, April 15, 2017 10:35 AM
Jeff Moden - Friday, April 14, 2017 2:09 PM
Could be but, as an example, I can tell you that the comedy of errors that occurred at GitHub makes me think about not ever using their services.
What problems at GitHub? There haven't been major issues that I've seen. There was a hacking problem, but that seems to be par for any service. There are some outages, but none I'd consider major.
If you're talking about the backup/restore headache from deleting a replication primary, that was GitLab.

My apologies to everyone. You're correct, it was GitLab, not GitHub.

But that also brings up another problem when such mistakes (even inadvertent ones) occur. "GITxxx" has become a powerful bit of branding, and a mistake in an unrelated area, a subdivision of a company, or even a totally different company can affect the whole brand. That holds even if the rest of the brand has little or nothing in common with the part that failed, especially if consumers don't have a clue what the differences are (or even that these are totally separate companies).

--Jeff Moden