SQLServerCentral is supported by Red Gate Software Ltd.
 
SSD Lifetimes
Posted Sunday, November 25, 2012 3:01 PM
Hall of Fame

Group: General Forum Members
Jeff Moden (11/25/2012)
FusionIO has listed 2 million hours and 6 years write endurance for their cards, ...

Perhaps I'm reading something wrong but 6 years is only about 52,596 hours. 2 million hours is more than 228 years so I doubt they've actually tested it for that.

The 6 years number for write endurance is almost believable (I'd like to see the actual tests, though) but the 2 million hours sounds more like an expected non-operational shelf life than anything else. Is this nothing more than smoke'n'mirrors advertising with big unprovable numbers to impress and entice the unwary user?


I think they quote most mechanical disks with a MTBF of 1 million+ hours. I have to assume that they get these numbers from the reported failure rate of large numbers of disks, but maybe they just make them up.
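For anyone who wants to check the hours-versus-years figures being argued about, here is a minimal sketch of the conversion. It assumes an average year of 365.25 days (8,766 hours); the function names are made up for illustration.

```python
# Average year including leap days: 365.25 days * 24 hours = 8,766 hours.
HOURS_PER_YEAR = 365.25 * 24


def years_to_hours(years: float) -> float:
    """Convert a duration in years to hours."""
    return years * HOURS_PER_YEAR


def hours_to_years(hours: float) -> float:
    """Convert a duration in hours to years."""
    return hours / HOURS_PER_YEAR


six_years_in_hours = years_to_hours(6)
two_million_hours_in_years = hours_to_years(2_000_000)

print(f"6 years = {six_years_in_hours:,.0f} hours")
print(f"2,000,000 hours = {two_million_hours_in_years:.1f} years")
```

Run as-is, this puts the 6-year figure at 52,596 hours and the 2 million hour figure at a little over 228 years.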



Post #1388418
Posted Monday, November 26, 2012 4:58 AM


SSC-Dedicated

Group: General Forum Members
Michael Valentine Jones (11/25/2012)
Jeff Moden (11/25/2012)
FusionIO has listed 2 million hours and 6 years write endurance for their cards, ...

Perhaps I'm reading something wrong but 6 years is only about 52,596 hours. 2 million hours is more than 228 years so I doubt they've actually tested it for that.

The 6 years number for write endurance is almost believable (I'd like to see the actual tests, though) but the 2 million hours sounds more like an expected non-operational shelf life than anything else. Is this nothing more than smoke'n'mirrors advertising with big unprovable numbers to impress and entice the unwary user?


I think they quote most mechanical disks with a MTBF of 1 million+ hours. I have to assume that they get these numbers from the reported failure rate of large numbers of disks, but maybe they just make them up.


Yeah... I don't believe those numbers, either.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

"Change is inevitable. Change for the better is not." -- 04 August 2013
(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1388539
Posted Monday, November 26, 2012 5:45 AM


SSC-Dedicated

Group: General Forum Members
Michael Valentine Jones (11/25/2012)
However, if your application is actually writing 20TB per day, then IO is likely a bottleneck, so it may be worth having a $55,000 10 drive SSD array.


Hmmm.... it really goes against my inner data-troll but, instead of spending who knows how much on fixing and regression testing the crap code that's causing all of those writes, it might just be worth the hardware investment.

Part of the reason it goes against my inner data-troll is that I don't want to set the precedent that it's OK to write crap code just because the SSDs might be able to handle it, especially since there's normally a fair amount of CPU time that goes along with the writes. CPUs cost a whole lot more than disks of any type if you consider licensing, etc.

An example of what I'm talking about (and I realize this has nothing to do with writes) is that one of our applications was set up to read a 4.7 million row table using a DISTINCT... to return a 51 row menu... every time the menu was used... and it's used about 1,000 times per hour. It was using more than 4 CPU hours per 8 hour day and well over a TB of reads (just for this one menu!!!). The fix for that was pretty easy and we got the total CPU time down to seconds and the number of reads down to a sliver.
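To put those menu numbers in per-call terms, here is a rough back-of-the-envelope sketch. It assumes the menu is hit at the quoted rate for the whole 8 hour day and treats "a TB" as a decimal terabyte. The post gives both figures only as lower bounds, so the results are floors, not measurements.

```python
# Figures quoted in the post; the CPU and read totals are stated as "more than".
CALLS_PER_HOUR = 1_000
HOURS_PER_DAY = 8            # assumption: the menu is used at this rate all day
CPU_HOURS_PER_DAY = 4
READ_BYTES_PER_DAY = 1e12    # assumption: "a TB" taken as 10**12 bytes

calls_per_day = CALLS_PER_HOUR * HOURS_PER_DAY

# CPU seconds burned each time the 51 row menu is built.
cpu_seconds_per_call = CPU_HOURS_PER_DAY * 3600 / calls_per_day

# Megabytes read each time the menu is built.
read_mb_per_call = READ_BYTES_PER_DAY / calls_per_day / 1e6

print(f"calls per day:        {calls_per_day:,}")
print(f"CPU seconds per call: at least {cpu_seconds_per_call:.1f}")
print(f"MB read per call:     at least {read_mb_per_call:.0f}")
```

Faster storage would shorten the wait on the reads, but the CPU seconds per call are spent either way.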

My point is that with an SSD, people would be tempted to let something like that go but I suspect the CPU time used would still be there. It's also one of those pieces of code that was poorly written because the table was much smaller when this all started and the duration was satisfactory. Now, with SSDs, folks have an additional justification to pump out rather thoughtless code and because of the reduction in the duration caused by the increased I/O speed, an unwary DBA might not pick up on how much CPU time is being used for such a terribly simple thing.

As with any growing system, we have a whole lot of other code in the same boat and it's just going to get worse for the CPUs.

The writes are another thing caused simply by a lack of understanding of how to update data from one system to another. It involves mostly 2 large tables and I'm surprised it hasn't "cut a path" in the proverbial carpet of the current disk system. I think the related processes would be pretty tough on the life expectancy of the spots hit on an SSD for that. Even if I had SSDs that lasted for a hundred years, the related unnecessarily high CPU usage is high enough to warrant fixing.

Heh... of course, the way around the problem of keeping developers from writing additional CPU performance challenged code is to continue to do what I've always done. Use the worst server available for the Dev box. No SSDs there!


Post #1388560
Posted Monday, November 26, 2012 8:27 AM
Hall of Fame

Group: General Forum Members
Jeff Moden (11/26/2012)
...The writes are another thing caused simply by a lack of understanding of how to update data from one system to another. It involves mostly 2 large tables and I'm surprised it hasn't "cut a path" in the proverbial carpet of the current disk system. I think the related processes would be pretty tough on the life expectancy of the spots hit on an SSD for that...

I believe that SSDs balance their writes so that data is actually written to a new spot each time in order to level the "write wear" over the entire memory area. Most have significant over-provisioning of memory cells in order to reach the stated lifetime.
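As a toy illustration of that idea (not how any particular controller actually works), the sketch below rewrites one "hot" logical page many times and counts erases per physical block, with and without a simple least-worn-block placement policy. The block count, write count, and policy are all assumptions made for the example.

```python
def simulate_writes(num_blocks: int, num_writes: int, wear_leveling: bool) -> list:
    """Return per-block erase counts after repeatedly rewriting one logical page."""
    erase_counts = [0] * num_blocks
    for _ in range(num_writes):
        if wear_leveling:
            # Toy policy: place each new copy on the least-worn physical block.
            target = min(range(num_blocks), key=erase_counts.__getitem__)
        else:
            # No remapping: the same physical block takes every write.
            target = 0
        erase_counts[target] += 1
    return erase_counts


hot_spot = simulate_writes(num_blocks=8, num_writes=1_000, wear_leveling=False)
leveled = simulate_writes(num_blocks=8, num_writes=1_000, wear_leveling=True)

print("without leveling:", hot_spot)
print("with leveling:   ", leveled)
```

In this model the hot spot wears out a single block, while the leveled run spreads the same number of writes evenly, which is why a repeatedly updated table does not "cut a path" the way it might on a fixed physical location.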

I agree that most performance issues have their origins in bad code, and the problems should be addressed there. However, there are times, especially with vendor-supplied software, where you just can't do anything about it, and an SSD may provide relief.

I had one a year ago where we had to use a large disk array to get the write performance we needed for an application that gathered network performance metrics for a large network. The database was actually fairly small, and I think it would have been more cost effective to use SSDs, but I just could not talk management into it.



Post #1388655
Posted Monday, November 26, 2012 10:05 AM


SSC-Dedicated

Group: Administrators
Jeff Moden (11/26/2012)
Michael Valentine Jones (11/25/2012)


I think they quote most mechanical disks with a MTBF of 1 million+ hours. I have to assume that they get these numbers from the reported failure rate of large numbers of disks, but maybe they just make them up.


Yeah... I don't believe those numbers, either.


They're statistical extrapolations. There's some science, but "mean" is "mean", not likely or expected. It means that half fail later, which implies that yours might fail tomorrow.
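A rough way to see what a "mean" figure does and doesn't promise: under one common simplifying assumption (a constant failure rate, i.e. exponentially distributed lifetimes, which nothing in this thread confirms for any real drive), the fraction of units that have failed by a given age can be sketched as below. The function name and the 1,000,000 hour figure are just for illustration.

```python
import math


def fraction_failed_by(hours: float, mtbf_hours: float) -> float:
    """Fraction of units expected to have failed by `hours`,
    assuming a constant failure rate (exponential lifetimes)."""
    return 1.0 - math.exp(-hours / mtbf_hours)


MTBF_HOURS = 1_000_000
HOURS_PER_YEAR = 8_760

failed_by_mtbf = fraction_failed_by(MTBF_HOURS, MTBF_HOURS)
failed_in_first_year = fraction_failed_by(HOURS_PER_YEAR, MTBF_HOURS)

print(f"failed by the MTBF age:   {failed_by_mtbf:.1%}")
print(f"failed within first year: {failed_in_first_year:.2%}")
```

Under that assumption, well over half the units are gone before the MTBF age is reached, and a small but non-zero share fails in the first year, so the mean says very little about when any one drive will die.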







Follow me on Twitter: @way0utwest

Forum Etiquette: How to post data/code on a forum to get the best help
Post #1388703
Posted Monday, November 26, 2012 11:57 AM


SSC-Dedicated

Group: General Forum Members
Steve Jones - SSC Editor (11/26/2012)
Jeff Moden (11/26/2012)
Michael Valentine Jones (11/25/2012)


I think they quote most mechanical disks with a MTBF of 1 million+ hours. I have to assume that they get these numbers from the reported failure rate of large numbers of disks, but maybe they just make them up.


Yeah... I don't believe those numbers, either.


They're statistical extrapolations. There's some science, but "mean" is "mean", not likely or expected. It means that half fail later, which implies that yours might fail tomorrow.


Precisely. I was in the service so I definitely know what MTBF stands for (Most Troubles Begin way more Frequently)... especially the ones that claim a million + hours.


Post #1388762
Posted Monday, November 26, 2012 1:01 PM
Hall of Fame

Group: General Forum Members
Jeff Moden (11/26/2012)
Steve Jones - SSC Editor (11/26/2012)
Jeff Moden (11/26/2012)
Michael Valentine Jones (11/25/2012)


I think they quote most mechanical disks with a MTBF of 1 million+ hours. I have to assume that they get these numbers from the reported failure rate of large numbers of disks, but maybe they just make them up.


Yeah... I don't believe those numbers, either.


They're statistical extrapolations. There's some science, but "mean" is "mean", not likely or expected. It means that half fail later, which implies that yours might fail tomorrow.


Precisely. I was in the service so I definitely know what MTBF stands for (Most Troubles Begin way more Frequently)... especially the ones that claim a million + hours.


If they have enough drives in service, MTBF should not be hard to calculate:
5,000 drives in service * 8,760 hours/year at 100% duty cycle = 43,800,000 total drive operating hours/year, so about 43 failures/year would give you an MTBF of roughly 1,000,000 hours

Of course, there may be a big difference in the failure rate of new drives vs. 6 year old drives, and they probably don't have any history for that.
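The fleet arithmetic above can be sketched in a few lines. This is only the simple "total operating hours divided by failures" estimate described in the post, using the post's own hypothetical fleet; the function name is made up for the example.

```python
def fleet_mtbf(drives: int, hours_per_drive: float, failures: int) -> float:
    """Estimate MTBF as total drive operating hours divided by observed failures."""
    return drives * hours_per_drive / failures


DRIVES_IN_SERVICE = 5_000
HOURS_PER_YEAR = 8_760      # 100% duty cycle, non-leap year
FAILURES_PER_YEAR = 43

total_hours = DRIVES_IN_SERVICE * HOURS_PER_YEAR
mtbf = fleet_mtbf(DRIVES_IN_SERVICE, HOURS_PER_YEAR, FAILURES_PER_YEAR)

print(f"total drive hours per year: {total_hours:,}")
print(f"estimated MTBF:             {mtbf:,.0f} hours")
```

With 43 failures the estimate lands slightly above 1,000,000 hours. It says nothing about how the failure rate changes as the drives age, which is the caveat raised above.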





Post #1388802
Posted Monday, November 26, 2012 1:51 PM


SSC-Dedicated

Group: General Forum Members
Michael Valentine Jones (11/26/2012)
Jeff Moden (11/26/2012)
Steve Jones - SSC Editor (11/26/2012)
Jeff Moden (11/26/2012)
Michael Valentine Jones (11/25/2012)


I think they quote most mechanical disks with a MTBF of 1 million+ hours. I have to assume that they get these numbers from the reported failure rate of large numbers of disks, but maybe they just make them up.


Yeah... I don't believe those numbers, either.


They're statistical extrapolations. There's some science, but "mean" is "mean", not likely or expected. It means that half fail later, which implies that yours might fail tomorrow.


Precisely. I was in the service so I definitely know what MTBF stands for (Most Troubles Begin way more Frequently)... especially the ones that claim a million + hours.


If they have enough drives in service, MTBF should not be hard to calculate:
5,000 drives in service * 8,760 hours/year at 100% duty cycle = 43,800,000 total drive operating hours/year, so about 43 failures/year would give you an MTBF of roughly 1,000,000 hours

Of course, there may be a big difference in the failure rate of new drives vs. 6 year old drives, and they probably don't have any history for that.


Let us (or is that "lettuce") hope they're not tossing the salad with those kinds of calculations.

Still, that wouldn't be so bad... it's a 0.86% failure rate.
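For reference, that percentage is just failures per year over drives in service. A minimal sketch, with a 100 drive shop thrown in purely as a hypothetical example of what the rate would mean in practice:

```python
FAILURES_PER_YEAR = 43
DRIVES_IN_SERVICE = 5_000

# Share of the fleet expected to fail in a year.
annual_failure_rate = FAILURES_PER_YEAR / DRIVES_IN_SERVICE


def expected_failures(fleet_size: int, rate: float = annual_failure_rate) -> float:
    """Expected number of failed drives per year for a fleet of the given size."""
    return fleet_size * rate


print(f"annual failure rate: {annual_failure_rate:.2%}")
print(f"expected failures per year in a 100 drive shop: {expected_failures(100):.2f}")
```

So at that rate, a shop running 100 drives would expect a bit less than one failure per year.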


Post #1388834
Posted Monday, November 26, 2012 2:19 PM


SSC-Dedicated

Group: Administrators
It seems those SSDs are performing much better, and I do know some people that have gotten multiple years in SQL Servers with SSDs. Not a representative sample by any means, but they are better than some I heard about 3-4 years ago that measured the tempdb SSD lifetime in months.






Post #1388848