Building a new SQL2008 server, would like your input

  • So I have been tasked with building a replacement SQL server for our very old SQL2000-based server. The current server has only a few hard drives (no RAID; 3 drives: 1 for the OS, 1 for the log, and 1 for data), 2GB of memory, and dual Xeon CPUs. Since this is a massive single point of failure, I want to replace it as soon as possible with a SQL2008-based configuration. While I am not 100% set on the server itself yet (thinking about an HP DL180 or better, with dual power supplies), I want to focus on the drives.

    I am not a SQL administrator (just the IT guy), but I want to get the drive layout right. This is what I have in mind:

    RAID1 array (OS + SQL software + misc. data) - 2 drives

    RAID5 array (SQL data) - 4 drives

    RAID10/5 array (SQL log files) - 4 drives

    RAID1 array (SQL tempdb; not sure about the best RAID level for this volume) - 2 drives

    Since I am trying to get a system supporting 12 2.5" 10,000RPM SAS drives (either 146GB or 300GB, depending on price), what layout would you recommend? Currently the DB is around 40GB.

    As for memory, I am looking at 8GB or more.

    Are there any other gotchas I should worry about?

    I did do some research and read through some older articles, but wanted to make sure I don't miss any major issues. Thanks!

  • Do you really need 4 drives devoted to logs? With only 12 drives to work with, that seems prohibitively expensive. Logs are mostly sequential writes; even a single 7200RPM drive has good sequential write speed. Unless your logs are excessively large, I would consider a 2 drive RAID1 for logs and take the two extra drives to improve the spindle count on the main data array.

    Your call, but I wouldn't put tempdb on its own array if all you can spare is 2 drives. Splitting off tempdb really only makes sense when you can give it room to run. For example, on our 36 spindle array we give tempdb 6 drives (RAID10).

    So I would also drop the tempdb array and use the 4 freed drives (2 from logs and 2 from tempdb) to create a 4 drive RAID10 array for indexes. The only thing to check is whether 2x the capacity of a single drive will be sufficient for your indexes.
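
    That index footprint is something you can measure rather than guess. A minimal sketch, assuming SQL 2005+ (sys.dm_db_partition_stats doesn't exist on the old SQL 2000 box, so run this after migration, or use sp_spaceused per table today):

        -- Total nonclustered index size for the current database (pages are 8KB)
        SELECT SUM(used_page_count) * 8 / 1024 AS index_used_mb
        FROM sys.dm_db_partition_stats
        WHERE index_id > 1;  -- index_id 0 = heap, 1 = clustered, >1 = nonclustered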

    Something like:

    2 drive RAID1 - OS/SQL

    2 drive RAID1 - Logs

    4 drive RAID5 - Data

    4 drive RAID10 - Indexes

    Keeping logs, and nothing but logs, on their own array keeps that IO pattern almost completely sequential writes (the DB simply appends new log entries to the end of the log). One caveat: some things do read the logs, possibly in a random manner (log backups, database mirroring, and replication), and that will degrade performance. If you aren't doing mirroring or replication, it is just academic. Ensure log backups run during downtime or under low load. If the server will be under 24/7 high load, then a 4 drive RAID10 for logs will likely be needed.
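
    If you want to see how heavy the log writes actually are before committing spindles, something along these lines works on SQL 2005+ (a sketch; the counters are cumulative since the last service restart):

        -- Write volume and write stalls per log file
        SELECT DB_NAME(vfs.database_id) AS db,
               vfs.num_of_writes,
               vfs.num_of_bytes_written / 1048576 AS mb_written,
               vfs.io_stall_write_ms
        FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
        JOIN sys.master_files AS mf
          ON mf.database_id = vfs.database_id AND mf.file_id = vfs.file_id
        WHERE mf.type_desc = 'LOG'
        ORDER BY vfs.num_of_bytes_written DESC;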

    It might be out of your budget, but you may want to look at a server with 2 drives mounted internally (for the OS, programs, and SQL Server) and a storage array of 12+ drives. I'm not sure about other vendors, but Dell has the MD1200 and MD1220. For some increased cost (an H700 external RAID controller card) you gain flexibility and the ability to easily grow storage in the future. For example, with 14 drives (2 internal + a 12 drive storage array) I would add 1 drive to the data array and leave one as a hot spare. Later, as the DB requirements grow, you could add additional arrays (e.g. bump logs up to a 4 drive RAID10, add a second data array, etc.).

    As for 146GB vs. 300GB: 2x 146GB is always better than 1x 300GB, so go with the highest number of SPINDLES your budget supports. Always pick smaller, more numerous drives. More spindles = more performance, and more flexibility. Given your existing DB is only 40GB (assuming that covers data, indexes, and logs), even 73GB drives would work if you need to save cost.

  • The above is a good place to start, but one key thing that has been overlooked is the TempDB.

    Is the TempDB heavily used? It could be worth pushing that off to a separate drive set.

    Do you know where the current problems/pressures are coming from? What are the bottlenecks?
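
    One rough way to gauge tempdb pressure on a SQL 2005+ instance (a sketch using the standard space-usage DMV; on the current SQL 2000 box you would have to watch the file sizes instead):

        -- What is actually consuming tempdb right now (pages are 8KB)
        SELECT SUM(user_object_reserved_page_count) * 8 AS user_objects_kb,
               SUM(internal_object_reserved_page_count) * 8 AS internal_objects_kb,
               SUM(version_store_reserved_page_count) * 8 AS version_store_kb,
               SUM(unallocated_extent_page_count) * 8 AS free_kb
        FROM tempdb.sys.dm_db_file_space_usage;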

    For just a standard SQL Server, I would go for:

    C - 2 disk - RAID1 - OS

    E - 2 disk - RAID1 - Logs

    F - 6 disk - RAID10 - Data & Indexes

    T - 2 disk - RAID1 - TempDB (this depends on the TempDB usage)

  • Thanks for the quick and detailed responses!

    tempdb is pretty small (1GB), and it currently sits on the OS partition, so I guess it couldn't get much worse performance-wise. I will definitely stick with 12 spindles (unless I can find a better configuration that keeps the server close to $10,000).
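
    (For anyone checking the same thing on a SQL 2000 box, file placement and sizes can be confirmed with something like this; 'YourAppDB' is a placeholder for the real database name:)

        -- sysaltfiles is the SQL 2000-era catalog; size is in 8KB pages
        SELECT dbid, name, filename, size * 8 / 1024 AS size_mb
        FROM master..sysaltfiles
        WHERE DB_NAME(dbid) IN ('tempdb', 'YourAppDB');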

    It is a 24/7 business, but the load shouldn't be that high on this new system; the night runs are mostly backups and reports (which do have a significant impact on DB performance on the current server).

    One of the sales engineers at a major hardware company I am getting a quote from also indicated that RAID10 with SQL2008 is a bad idea, but I am having trouble confirming this.

  • I'm not really qualified to comment, but I'd like to add a question on topic.

    After reading about spindle counts etc., I'm surprised there is no mention of SSDs. Are they no good for the job? I imagined they'd be insanely fast in a RAID, but I've never been there.

  • As a customer, I would demand that any hardware vendor claiming RAID10 is not viable back that claim up with verifiable benchmarks.

    Here is a benchmark done by a 3rd party for Dell. The first dozen pages are all rah-rah "Dell beats HP", but go to the end and you can see the setup and raw scores. The HP P410i SAS RAID controller achieved 1561 IOPS in a 6 drive RAID5 configuration and 2659 IOPS in a 6 drive RAID10 configuration.

    http://www.dell.com/downloads/global/products/pvaul/en/Dell6Gbps-vs-HP6Gbps.pdf

    The main drawback to RAID10 is the "cost" in terms of redundant drives. In a 6 drive RAID5 config you get 5x a single drive's capacity in usable space; in a 6 drive RAID10 you get only 3x. Still, based on the number of spindles you are looking at and the current DB size, that appears to be a non-issue.

    Either RAID5 or RAID10 is fine. The point is that although in theory it would be best to have 6+ arrays (OS, data1, data2, index, tempdb, logs), with only 12 drives to work with you need to make some compromises. Having 4 drives for logs is "expensive" (1/3 of your spindles right there). Having a separate tempdb array can actually CREATE a bottleneck. Simplistic view: if the data array has 6 spindles working for it and tempdb has only 2, you are crippling your data array's throughput, because transfers to tempdb will be constrained by the significantly reduced throughput of a pair of mirrored disks.

    Personally I would:

    a) Use only 2 drives (RAID1) for logs. While more drives are always nicer, anything more is just too expensive when you only have 12 (10 after the OS pair) drives to work with. Just don't put the logs on the OS drives; you want them on their own array with nothing else on it.

    b) Keep tempdb on the data array and make that array either RAID10 or RAID5.

    c) Consider making a separate index array and make it RAID10.

    One last thought:

    If your goal is to keep the system under $10K and your current DB is only 40GB, you may want to consider going with fewer and lower-capacity drives, then take those funds and double your RAM. RAM is insanely cheap today, and SQL Server can always use more RAM (well, within reason).
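
    If you do load up on RAM, the one setting to remember is capping SQL Server's memory so the OS keeps some headroom. A minimal sketch (the 12288MB figure is purely illustrative; size it to the machine's final RAM):

        -- Leave a few GB for Windows; adjust to the machine's actual RAM
        EXEC sp_configure 'show advanced options', 1;
        RECONFIGURE;
        EXEC sp_configure 'max server memory (MB)', 12288;
        RECONFIGURE;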

  • Revised, as you don't seem to use TempDB much:

    C - 2 disk - RAID1 - OS

    E - 2 disk - RAID1 - Logs

    F - 4 disk - RAID10 - Data

    G - 4 disk - RAID10 - Indexes

    Well, something along those lines...

    But I definitely agree with getting more RAM. Get the fastest disks you can, and if you are not limited by space, keep the drive sizes down.

    EDIT - I don't agree with your sales guy; as stated above, RAID10's drawback is the cost of the redundant disks, not its performance.

  • I currently have my choices narrowed down to a Dell R510 server, which supports 12+2 drives (2 internal bays), and an HP DL370 G6, which supports 16 drives.

    This is the link he sent me regarding RAID5 vs. RAID10:

    http://weblogs.sqlteam.com/billg/archive/2007/06/18/RAID-10-vs.-RAID-5-Performance.aspx

    PS: Thanks again for the detailed responses, they are extremely helpful!

  • So since I will have 2 internal drives for the OS in a RAID1 configuration, I have 12 drives to play with for just the MSSQL stuff. What do you guys think about RAID6 for the data array? I understand it is a little slower than RAID5, but the data array wouldn't be as busy as the log array, would it? Or should I use the extra slots for hot spares?

  • I would definitely love your input again, since I now have 12 drives for just the MSSQL data (the OS is on the internal RAID1 array), vs. 10 originally. Based on the previous comments, this is the layout I am considering:

    2 disk - RAID1 - OS

    4 disk - RAID10 - Logs

    4 disk - RAID10 - Data

    4 disk - RAID10 - Indexes

    What do you guys think?

  • "Busy" is all relative. Log IO is sequential writes, which any drive can do without breaking a sweat.

    Data IO patterns will vary but are almost always random. The "average" (and there is no good average) would be 60% random reads, 30% random writes, and 10% sequential reads/writes.

    RAID5 has an overhead when it comes to random writes; RAID6 further increases that overhead.

    With 12 drives dedicated to SQL, I would look at something like this as a starting point:

    Data - 5 drives RAID 5 (or RAID10)

    Index - 4 drives RAID10 (if you need more space, RAID5 is an option)

    Log - 2 drives RAID1

    Hotspare - 1 drive

    You can go with RAID6 instead of RAID5-with-hotspare, but you will pay for it with slower random writes than RAID5 (which already has slower random writes than RAID10). The parity computations for RAID6 are much more complex than for RAID5, and that will be felt in IOPS when it comes to random writes.
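
    For a back-of-envelope feel of those write penalties (assuming ~140 random IOPS per 10k SAS spindle and the textbook penalty factors of 2 for RAID10, 4 for RAID5, and 6 for RAID6; these figures are illustrative, not from a benchmark):

        -- Estimated random-write IOPS = spindles * per-disk IOPS / write penalty
        SELECT 'RAID10, 4 drives' AS layout, 4 * 140 / 2 AS est_write_iops  -- 280
        UNION ALL
        SELECT 'RAID5, 5 drives',  5 * 140 / 4                              -- 175
        UNION ALL
        SELECT 'RAID6, 6 drives',  6 * 140 / 6;                             -- 140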

    About logs.

    While logs can be a bottleneck, IF you keep them on their own partition (with absolutely nothing else on it) you can keep the I/O pretty much sequential. Any decent SAS drive is going to push 100 to 120 MB/s of sequential writes, which is more than enough for most DBs. The key is to ensure that NOTHING, absolutely NOTHING, uses the log array except the logs. Adding even a small amount of other I/O can bog the array down and create a bottleneck. If you feel you must use more than 2 spindles for logs, then go 4 drive RAID10; logs are relatively small and write-intensive, so RAID5 or 6 offers no advantages and lots of disadvantages.
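
    For completeness, pointing an existing database's log at the dedicated array is one statement per file (a sketch; the database and path names are hypothetical, and the new location takes effect only after the database is taken offline and the file physically moved):

        -- Repoint the log file to the dedicated log array
        ALTER DATABASE YourAppDB
        MODIFY FILE (NAME = YourAppDB_log, FILENAME = 'L:\SQLLogs\YourAppDB_log.ldf');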

    About tempdb.

    It doesn't make sense to split tempdb off unless you can give it enough spindles to run. Personally, I have never seen a situation where it made sense to split off tempdb unless you had at least 6 spindles to spare (for RAID10).
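
    If you do end up splitting tempdb off, relocating it is simple, since tempdb is recreated at every service start (a sketch; tempdev and templog are the default logical names, the T: path is hypothetical, and the change takes effect on the next restart):

        -- Move tempdb's data and log files to the dedicated array
        ALTER DATABASE tempdb
        MODIFY FILE (NAME = tempdev, FILENAME = 'T:\TempDB\tempdb.mdf');
        ALTER DATABASE tempdb
        MODIFY FILE (NAME = templog, FILENAME = 'T:\TempDB\templog.ldf');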

  • Your post seems to indicate that going with 3 RAID10 arrays is not that bad an idea (especially after comparing performance/recovery time between RAID5 and RAID10). I do plan on keeping the log array just for logs, and I want to do the same for data and indexes (putting tempdb on the OS RAID1 mirror).

    Thanks again for the response!

  • I hadn't seen your earlier post when I commented above ...

    Based on previous comments, this is the layout I am considering:

    2 disk - RAID1 - OS

    4 disk - RAID10 - Logs

    4 disk - RAID10 - Data

    4 disk - RAID10 - Indexes

    Looks good to me. You may not need 4 drives for logs but it certainly won't hurt. That setup is a good conservative choice where space is not a huge issue and you can "afford" the disk premium of RAID10. You get high throughput, fast recovery, and no write overhead.

    If you put tempdb on the OS mirror, I would test it extensively; otherwise you can cripple the throughput of the overall server. SQL Server will assume tempdb is roughly the same performance as the data drive. If you have 4 drives in RAID10 for data and only a 2 drive RAID1 shared with the OS for tempdb, that assumption will be invalid.

    Remember that the SQL OS will use tempdb for temporary tables to hold intermediate values when generating execution plans, even if the user query doesn't explicitly use any temporary tables. If tempdb is on a fast array this can improve performance, as the data array and temp array work together to maximize throughput; but if tempdb is on a slow array it will degrade performance. Worse, it will act as a bottleneck, as your fast data array sits idle waiting for data to be slowly passed in and out of tempdb.

    No two DBs are alike, but 99% of the time, unless you plan on devoting real resources to a dedicated tempdb (at least as many spindles as your primary data array), it is unlikely that splitting tempdb off from data will improve performance; it may actually significantly hinder it. If you have enough RAM, an alternative is to put tempdb on a RAM disk.
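
    One way to do that extensive testing, sketched under the assumption that a representative load has been running for a while first: compare the average IO stall per file and see whether the tempdb files stand out.

        -- Average stall (ms) per IO, worst files first; cumulative since restart
        SELECT DB_NAME(vfs.database_id) AS db,
               mf.name AS logical_file,
               vfs.io_stall / NULLIF(vfs.num_of_reads + vfs.num_of_writes, 0) AS avg_stall_ms_per_io
        FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
        JOIN sys.master_files AS mf
          ON mf.database_id = vfs.database_id AND mf.file_id = vfs.file_id
        ORDER BY avg_stall_ms_per_io DESC;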

  • The server has 24GB of memory. What about removing the RAID10 array for logs and replacing it with one RAID1 array for logs and one RAID1 array for tempdb? Or should I keep the logs array and just put tempdb on either the data or the indexes array? I am mostly concerned about hardware failure; performance-wise, even a single disk in this new machine would be faster than the existing server (which uses 7200RPM IDE drives).

    2 disk - RAID1 - OS

    2 disk - RAID1 - Logs

    2 disk - RAID1 - TempDB

    4 disk - RAID10 - Data

    4 disk - RAID10 - Indexes

    or

    2 disk - RAID1 - OS

    4 disk - RAID10 - Logs

    4 disk - RAID10 - Data (/TempDB?)

    4 disk - RAID10 - Indexes (/TempDB?)

  • RAID10 requires an even number of drives.

    The best solution with 12 drives available would possibly be

    OS RAID1 (2 drives)

    Logs RAID1 (2 drives)

    Data RAID10 (4 drives)

    Backup RAID1 (2 drives)

    Tempdb RAID1 (2 drives)

    -----------------------------------------------------------------------------------------------------------

    "Ya can't make an omelette without breaking just a few eggs" 😉
