Configuring virtual drive settings

  • Hello -

    Ok... I'll just come clean... I'm an IT guy playing in your realm - so forgive the newb-esque nature of this question.

    I'm setting up a new SQL server - at this point I'm just configuring the hardware and then handing it off to a DBA. Unfortunately, there's little guidance on how to best set the virtual drive policies for how the controllers manage the drives connected to them.

    Here's a breakdown of the environment:

    HDWR:

    IBM 3650 M3

    16 drives

    2, 5015 controllers

    OS:

    Windows Standard 2008 R2

    APP:

    SQL Server 2005 Standard

    DRIVE CONFIG:

    5 drive 'groups'

    Controller 1

    1- OS -- RAID 1

    2- TEMP -- RAID 1

    3- LOG -- RAID 10

    Controller 2

    4- DB -- RAID 10

    5- Back UP -- RAID 10

    The vast majority of the activity on this DB is reading with most of the writes being done when users are not hitting it.

    I’d like to know what the "best practices" approach is to the drive policy setup - as it applies to the various drive groups and the roles that each group will be supporting. More specifically as I consider the virtual drive option policies (Read, Write, I/O, Stripe Size..) and their alternatives (Read Ahead or Adaptive, Write Back or Thru, I/O Direct or cached), for each of the drive groups, what are the suggested settings - especially considering that SQL will control some of these factors (like caching)?

    So where do I set Read Ahead (on the DB?), and what about Write Back? I'm thinking I'll just take the default settings for the OS and TEMP - but what of the others? And if I set Read Ahead for one (like the DB) should match those settings on another (like the Temp)?

    Any direction would be appreciated.

    Thanks.

  • The IO is usually the bottle neck in a DB Server and people tend to ignore this when setting up DB Servers. There fore keep in mind that the more IO controllers you have, the better. We use a strip size of 256 with a allocation unit as 64k. We set up our read cache as 0% and write cache as 100%.

    We also have three controllers (Ours is HP, One build in P400 and two external drives with P800 controllers) There fore we are able to split up the Data file and log file in separate IO path and TempDb in another Path. Raid 10.

    You could use SQLIOSim and other tools to get the idea of how your server will behave under load.

    Just my 2 cents

    -Roy

  • I like Roy's advice. However if you are going to cache writes, you need to be sure you have battery backed controllers as an issue could cause you problems here. The DB server will assume that something confirmed from the controller, which could be stuck in cache, is hardened to disk. If that isn't true, and the power dies before something is written to disk, it could cause you inconsistent data on restart/recovery.

    I have typically gone with 0 write caching, complete write through to avoid issues.

    Also, backup doesn't necessarily need R10, depending on your size/perfomance issues. You could go with R1 or event R5

  • Gentlemen –

    Thanks for taking the time to respond – and thanks for the guidance.

    1) Both of the controllers have battery backups – so losses arising from corrupt caching has been covered.

    2) Regarding the comment of “strip” size – I’m assuming (so please correct me if I’m wrong) you meant the “stripe” size, as the stripe represents the length of the data segment across all of the drives and the strip represents the data residing on the individual drive as part of the total stripe. So if you had a strip of 256 on 4 drives, your stripe would be 1024. Is that what you meant? If I increase the stripe to 1024 my read performance goes up but only for sequential reads. If however there are multiple random reads, the larger stripe will impede performance.

    And that brings me to another question...

    As I understand it from those that are working with the db that will be migrated to this server, there are (as a result of design and history) many, many joins in this db – and as such, I'm told, that the reads are more random that sequential. If that is the case (the copious joins), does it truly become a more random than sequential read process? Again, I’m quite green to db design, so I’m trying to interpret/convey what I’m hearing, to how I should best setup the drive policies.

    3) I'm with you on the read / write configs for the DB… do these settings also apply to the Log, Temp and Backup?

    Thanks again.

    SB

  • Yes, I was talking about Stripe...

    For user DB, the reads are almost always non sequential. But for the Log files, it is sequential writes. Another thing that you could keep in mind is to have small capacity disks. Instead of using 300GB 15K disks, I would go for smaller capacity disks (72GB for example). The reason behind is that you get more spindles in the case of smaller capacity disks. Every little bit of improvements count in the case of an OLTP server.

    Another thing to keep in mind is to make your Pagefile not very big. You do not want paging to the disk. What is your RAM? Any reason why you are not using Datacenter instead of Standard?

    -Roy

  • Roy - Thanks a lot for the clarification on how those DB’s are read!!

    As far as spindle count, this is a 16 drive server and we’re filling it to capacity – at this point however, I’ll have to work with the drive capacities as shown:

    Controller 1

    1- OS -- RAID 1 --- 2 @ 146

    2- TEMP -- RAID 1 --- 2 @ 146

    3- LOG -- RAID 10 --- 4 @ 146

    Controller 2

    4- DB -- RAID 10 --- 4 @ 300

    5- Back UP -- RAID 10 --- 4 @ 300

    (All are SAS 10K drives)

    Good point on the Paging File. The server currently has 24GB of RAM and has the ability to go to 192GB. I’ll let the system manage the paging file initially, get some metrics, and then adjust it from there if needed.

    As far as the caching issue is concerned - I'm finding conflicting views. There is caching for I/O, then there is drive caching, and then there is the caching that SQL manages. If certain caching combinations are enabled it seems it might cause inefficiencies.

    So then, as I set up the RAID for the various virtual drive groups, where should drive caching be used, or not used?

    Unfortunately price was the gotcha on DataCenter. Also, there’s a feeling that we will be moving to SQL 2008 within the next year – so there was no sense blowing the cash on the upgrade.

    Thanks again for your time and attention – it’s greatly appreciated.

    Steve

Viewing 6 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply