RAID 1 vs. RAID 10

  • Lots of great articles on the intertubez about disk alignment and RAID configurations, but I haven't found an answer (or a good way to test, and yes, I know about SQLIO) for a simple scenario:

    Suppose I have 4 local physical disks available to me and I'm going to create an OLTP DB (for the sake of argument, let's say we're 50/50 on reads and writes, or otherwise average usage). Here are the two scenarios I would consider:

    1) Two RAID 1 drives. Create the DB with two data files, i.e. one data file on each drive

    2) One RAID 10 drive. Create the DB with one single data file

    In the RAID 1 scenario (as I understand it) SQL Server will round-robin writes between the two data files, thus creating a software equivalent of striping. RAID 10, on the other hand, handles the striping at the hardware level. Those differences aside, I haven't found a good technical explanation - or numbers to back it up - for what's happening under the covers that would make me believe one option is better than the other.

    So which of these two scenarios is more ideal and why?
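    For concreteness, scenario 1 might be set up something like this (database name, drive letters, paths, and sizes are all made up for illustration):

    ```sql
    -- Scenario 1: two equally sized data files in one filegroup,
    -- one file on each RAID 1 volume (E: and F: here are hypothetical)
    CREATE DATABASE OltpDb
    ON PRIMARY
        (NAME = OltpDb_Data1, FILENAME = 'E:\Data\OltpDb_Data1.mdf', SIZE = 10GB),
        (NAME = OltpDb_Data2, FILENAME = 'F:\Data\OltpDb_Data2.ndf', SIZE = 10GB)
    LOG ON
        (NAME = OltpDb_Log, FILENAME = 'G:\Log\OltpDb_Log.ldf', SIZE = 4GB);
    ```

    With both files in the same filegroup and sized equally, proportional fill should spread the writes roughly evenly between the two volumes.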

    Kendal Van Dyke
    http://kendalvandyke.blogspot.com/

  • I would argue RAID 10 for one reason: you're not 100% sure of the usage and balance among the drives, and with two RAID 1s you can run out of space on one while there's still space on the other. With one large RAID 10 that gets handled for you and you get to use all the space.

    Other than that, I'm not sure there's a great technical argument for RAID 1 vs. RAID 10.

  • You should worry about your Log file before you add a second data file.

    -- RBarryYoung, (302)375-0451 blog: MovingSQL.com, Twitter: @RBarryYoung
    Proactive Performance Solutions, Inc.
    "Performance is our middle name."

  • RBarryYoung (12/23/2008)


    You should worry about your Log file before you add a second data file.

    Ummm....thanks....but kinda not the point of the post. I'm looking for technical reasons why a single data file on RAID 10 is better than two data files on two RAID 1's.

    But FWIW I'll stick my log files on a different drive than my data files altogether. :hehe:

    Kendal Van Dyke
    http://kendalvandyke.blogspot.com/

  • I doubt that there is much difference from a performance standpoint ... the RAID 1 scenario would presumably require a tiny increase in system resources. RAID 10 would appear to be a little easier to manage, inasmuch as you would be dealing with one file rather than two. If I remember correctly, fragmentation statistics aren't accurate for multiple files either.

    The above all assumes that the two files are in one filegroup (as that would use the proportional fill algorithm) - if one were to split the database into two filegroups (one file per filegroup) then there are other potential advantages to the RAID 1 scenario: filegroup backups, partitioning, etc.
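    A sketch of that one-file-per-filegroup variant (assuming a database named OltpDb; all names and paths here are hypothetical):

    ```sql
    -- Variant: second file in its own filegroup, enabling filegroup-level
    -- operations such as filegroup backups and object placement
    ALTER DATABASE OltpDb ADD FILEGROUP FG2;
    ALTER DATABASE OltpDb ADD FILE
        (NAME = OltpDb_Data2, FILENAME = 'F:\Data\OltpDb_Data2.ndf', SIZE = 10GB)
    TO FILEGROUP FG2;

    -- Specific tables/indexes can then be created ON FG2, and the
    -- filegroup can be backed up independently:
    BACKUP DATABASE OltpDb FILEGROUP = 'FG2'
    TO DISK = 'H:\Backup\OltpDb_FG2.bak';
    ```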

  • Another possible implication of two RAID 1s would be if they were in the same filegroup, with a massive table that was frequently being scanned. If I remember rightly, in that situation (a table scan against a table spread over more than one file) SQL Server will initiate multiple threads (one per file) and run the scans in parallel.

    Mike

  • kendal.vandyke (12/23/2008)


    I'm looking for technical reasons why a single data file on RAID 10 is better than two data files on two RAID 1's.

    Ah, I see, I misunderstood the intent of your question. Sorry...

    -- RBarryYoung, (302)375-0451 blog: MovingSQL.com, Twitter: @RBarryYoung
    Proactive Performance Solutions, Inc.
    "Performance is our middle name."

  • Ah, I see, I misunderstood the intent of your question. Sorry...

    No apologies necessary! 🙂

    Kendal Van Dyke
    http://kendalvandyke.blogspot.com/

  • I would say the RAID 10 array is best as it offers the best performance and fault tolerance combined into one package. The downside of RAID 10 is the disk cost (the number of disks required).

    -----------------------------------------------------------------------------------------------------------

    "Ya can't make an omelette without breaking just a few eggs" 😉

  • I would say the RAID 10 array is best as it offers the best performance and fault tolerance combined into one package. The downside of RAID 10 is the disk cost (the number of disks required).

    Maybe the question was misunderstood. I proposed two scenarios for how to configure 4 disks to hold data. A single RAID 10 with 4 disks is just as fault tolerant as two RAID 1 drives - each configuration can lose one disk per mirrored pair. Likewise, the disk cost is the same in the question I asked.

    As for performance, I'm looking for something solid to show that RAID 10 would be better than RAID 1 or vice versa. I was really hoping someone knew enough about what's going on under the covers (e.g. IO paths, threads, etc.) to make it clear.

    Kendal Van Dyke
    http://kendalvandyke.blogspot.com/

  • kendal.vandyke (12/24/2008)


    Maybe the question was misunderstood.

    Not at all.

    RAID 10 will generally offer a performance boost over RAID 1 due to the striping across mirrored sets. Since SQL Server's writes to the data files are a fairly random affair, this suits SQL Server just fine. As RBarryYoung pointed out, the log file is more important as it suffers sustained serial writes, so the performance requirements are different there. It all depends on how many disks and controllers are used in the config, too. Also, do you have any baseline for the expected reads/writes? If not, it may be worth obtaining some.

    -----------------------------------------------------------------------------------------------------------

    "Ya can't make an omelette without breaking just a few eggs" 😉

  • Perry Whittle (12/25/2008)


    kendal.vandyke (12/24/2008)


    Maybe the question was misunderstood.

    Not at all.

    RAID 10 will generally offer a performance boost over RAID 1 due to the striping across mirrored sets. Since SQL Server's writes to the data files are a fairly random affair, this suits SQL Server just fine. As RBarryYoung pointed out, the log file is more important as it suffers sustained serial writes, so the performance requirements are different there. It all depends on how many disks and controllers are used in the config, too. Also, do you have any baseline for the expected reads/writes? If not, it may be worth obtaining some.

    Assuming that the system was running, how would you go about systematically getting baselines for reads/writes?

  • jlp3630 (12/27/2008)


    Assuming that the system was running, how would you go about systematically getting baselines for reads/writes?

    As you say, it depends on whether the system is already live. If you're scoping new disks for a current server then you should already have baseline data for disk I/O anyway. If you're scoping a completely new system then the database vendor would be a good start (if it's a 3rd party app), or try it in a test lab if you have one.

    -----------------------------------------------------------------------------------------------------------

    "Ya can't make an omelette without breaking just a few eggs" 😉

  • Perry Whittle (12/28/2008)


    jlp3630 (12/27/2008)


    Assuming that the system was running, how would you go about systematically getting baselines for reads/writes?

    As you say, it depends on whether the system is already live. If you're scoping new disks for a current server then you should already have baseline data for disk I/O anyway. If you're scoping a completely new system then the database vendor would be a good start (if it's a 3rd party app), or try it in a test lab if you have one.

    Perry,

    Telling him that he should already have baseline metrics didn't really answer his question.

    jlp3630, you said that your system was already running. Here are some ways you can gather baseline metrics:

    1) Set up a performance monitor counter log that writes to a SQL database. You can write ad-hoc queries to look at the raw data.

    2) Use a tool like SQLH2 to gather perfmon metrics on a schedule, then use the reports that come with the tool to look at the numbers the tool collected. You can also use your own queries directly against the SQLH2 repository if you don't like the pre-canned reports.

    3) Use a 3rd party tool like SQL Sentry Performance Advisor or Idera SQL Diagnostic Manager

    #1 & #2 are free but don't have the "polish" of the 3rd party tools.
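    As a quick complement to the perfmon approaches above, SQL Server's own per-file I/O stats can be sampled directly. This query is just a sketch - the DMV returns cumulative counters since the instance started, so you'd sample it twice and diff the numbers to get a rate:

    ```sql
    -- Cumulative reads/writes and I/O stall times per database file
    SELECT DB_NAME(vfs.database_id) AS database_name,
           mf.name AS file_name,
           vfs.num_of_reads,
           vfs.num_of_writes,
           vfs.io_stall_read_ms,
           vfs.io_stall_write_ms
    FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
    JOIN sys.master_files AS mf
      ON mf.database_id = vfs.database_id
     AND mf.file_id = vfs.file_id
    ORDER BY database_name, file_name;
    ```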

    Kendal Van Dyke
    http://kendalvandyke.blogspot.com/

  • I have a 20 GB database and 6 SAS disks.

    Configuration is:

    two work as a span (RAID 1)

    another two are a mirror of the first two

    another two are global hot spares.

    I tested it and it works like a charm. When a disk goes out, the controller takes a hot spare and automatically rebuilds the array.

Viewing 15 posts - 1 through 15 (of 56 total)
