Moving tempDB from RAID10 to RAID5 for OLTP?

  • My data files and tempDB file (single temp file) share the same RAID10 LUN and our SAN controller is saturated/overloaded. When there is extreme load put on the LUN, the server OS will lose connection to the drives and SQL crashes.

    It crashes each time I run a db integrity check on my largest db even when there's minimal system activity. The OS lost connection to the SAN during production hours last week, under heavy load.

    Can't buy more spindles, but have an opportunity to re-allocate spindles to build a dedicated tempDB LUN, but there's only enough drives for a 250gb'ish LUN at RAID5.

    Based on my 365 day report (attached) comparing reads/writes/trans, does RAID5 seem safe for the tempDB? Overall the prod user db gets about 30% more write activity than temp, but temp peaks about 30% higher - rarely.

    Should I consider a smaller LUN at RAID10 for the tempDB? It RARELY exceeds 50gb's. My goal is to relieve some pressure on the large RAID10 data LUN by moving the tempDB.

    Would appreciate any thoughts or suggestions...

  • After I had an actual conversation with my SAN guy, it looks like I will be able to scrape up enough spindles to run RAID10 at 200gb's for the tempDB.

    I would feel more comfortable with a larger LUN, but we'll see how it goes...

  • Raid 10 is a better option if you have enough disk space.The write are done faster on raid 10 beacause unlike raid 5 there's no need to calculate the parity.So the overall system performance would be degraded if you choose Raid 5 for tempdb.

    Pooyan

  • pooyan_pdm (8/22/2012)


    Raid 10 is a better option if you have enough disk space.The write are done faster on raid 10 beacause unlike raid 5 there's no need to calculate the parity.So the overall system performance would be degraded if you choose Raid 5 for tempdb.

    This now depends on your equipment and software as much as anything else. RAID 5 has been better optimized, and current speed tests (particularly on big iron with huge caches) show negligible differences under the majority of circumstances. RAID 10 is still SAFER, as you can handle a double drive loss as long as it's not the same stripe, but the speed difference has significantly dwindled. The older your equipment/software, the more likely pooyan's point will be true, however.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • I guess a question on my mind at this point is why anyone would need a 200GB TempDB.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (8/22/2012)


    I guess a question on my mind at this point is why anyone would need a 200GB TempDB.

    It's unfortunately not hard to need it on an overly active system that's utterly out of scope and you're not allowed to fix it. Between table spools for sorts, RCSI, reports being built off the OLTP system, and any other number of things you can slam a TempDB FAR too easily off a half a tera database. I'm staring at one that does it daily.

    Vendor DB, of course. To the vendor's credit we ARE two versions behind and they've stated they never expected the system to handle our volume.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • Some advice for the OP, if you can take down your SAN by slamming it too hard, you should consider reducing your hba queue depth to limit the number of outstanding io requests. you might take a hit on max throughput, but if the san can't handle it anyway, you're not losing much.

  • SpringTownDBA (8/22/2012)


    Some advice for the OP, if you can take down your SAN by slamming it too hard, you should consider reducing your hba queue depth to limit the number of outstanding io requests. you might take a hit on max throughput, but if the san can't handle it anyway, you're not losing much.

    Moved it to the new LUN, but I won't know if it helped until this weekend when I run something really I/O intensive.

    Will report back...

  • This seems to have remedied the problem. DB integ check finished without crashing for the first time in a long time.

  • Thank you for reporting back - did performance improve as well as the crash frequency being reduced?

    Why would you need a 200GB tempdb? Index creates and/or rebuilds on 86.2GB-1.2TB tables, of course!*

    Check your metrics on the SAN - is it heavily loaded in terms of IOPS, in terms of throughput, or both? If it's IOPS, you've done well to get new spindles also on the SAN. If it's throughput, then a better solution is to provide enough IOPS and space locally, which will give you the tempdb performance you need, as well as free up SAN throughput for the rest of your (and other) uses.

    Local SSD's in RAID1 or RAID5 are excellent for tempdb use with modern controllers - and yes, if you benchmark them, you'll see that RAID5 performance and RAID10 performance are pretty close, and if you're using that few spindle disks, either one with a handful of good SATA or SAS SSD's (3 100GB SSD's in RAID5, say) should trounce whatever performance you're getting now.

    *download.microsoft.com/download/D/9/4/D948F981-926E-40FA-A026-5BFCF076D9B9/TEMPDB_Capacity_Planning_Index.doc

  • Jeff Moden (8/22/2012)


    I guess a question on my mind at this point is why anyone would need a 200GB TempDB.

    That would be my first question

    Evil Kraig F (8/22/2012)To the vendor's credit we ARE two versions behind and they've stated they never expected the system to handle our volume.

    They always give this BS, it's a standard vendor response 😉

    -----------------------------------------------------------------------------------------------------------

    "Ya can't make an omelette without breaking just a few eggs" 😉

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply