RAID

  • Bob Cullen-434885 (1/26/2011)


    cengland0 (1/26/2011)

    Imagine if you lost one drive in the first spanned set of drives and one drive in the second spanned set of drives. All your data is then lost. It doesn't matter how many drives you have. All you need to do is lose one drive in both sets and it's all over.

    My point exactly. So why is one more tolerant than the other, according to the "right" answer?

    Because if your first set is mirrored instead of spanned and the second set (which should be an exact copy of the first set) is either spanned or mirrored (it doesn't matter), you can lose any two drives out of the four (four drives is the minimum for both 0+1 and 1+0) and still be able to recover your data.

    The above statement doesn't hold true if you span both your original set and the backup set.

    RAID 1+0 uses two drives for mirroring and then uses another set of two drives for spanning/striping the mirrored drives. Any two drives can crash and you can still recover data.

    RAID 0+1 uses two drives for spanning/striping and then two more drives that are a copy of the original two drives, and spans/stripes those too. If one drive from the first set and one from the second set fail (two drives total), you're going to have difficulty recovering your data.

  • James Lean (1/26/2011)


    In RAID 0+1 you have two stripe sets that are mirrored, therefore you could theoretically lose all of the disks from one of the stripe sets and still have the data available.

    In RAID 1+0 you have a number of 2-disk mirrors that are then striped together. Theoretically you could lose one disk from each mirrored pair and still have the data available.

    And obviously on the flip-side, you could lose one disk from each stripe set in RAID 0+1 and lose all the data; or you could lose both disks from one of the mirrored pairs in RAID 1+0 and lose all the data.

    So in each level you can lose anywhere from two to half the disks, so aren't they both as tolerant as each other?

    -----
    JL

  • James Lean (1/26/2011)

    And obviously on the flip-side, you could lose one disk from each stripe set in RAID 0+1 and lose all the data; or you could lose both disks from one of the mirrored pairs in RAID 1+0 and lose all the data.

    So in each level you can lose anywhere from two to half the disks, so aren't they both as tolerant as each other?

    Well, no, because the only way a two-disk failure could take out a RAID 1+0 set is if you lose both disks from a mirror. There are more possible ways a two-disk failure could take out RAID 0+1, so by the law of probabilities you're more likely to lose your array from a two-disk failure in that case.
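    To make the counting concrete, here is a minimal Python sketch for the four-disk case. The disk numbering (0-3) and the function names are assumptions made for illustration, not anything a RAID controller exposes; the sketch only models which sets of failed disks leave the data unreadable.

```python
from itertools import combinations

# Hypothetical 4-disk layouts; disk numbers 0-3 are illustrative only.
MIRRORS_10 = [(0, 1), (2, 3)]  # RAID 1+0: two mirrored pairs, striped together
STRIPES_01 = [(0, 1), (2, 3)]  # RAID 0+1: two stripe sets, mirrored to each other

def raid10_lost(failed):
    # Data is lost only when both disks of some mirrored pair have failed.
    return any(all(d in failed for d in pair) for pair in MIRRORS_10)

def raid01_lost(failed):
    # Data is lost when every stripe set contains at least one failed disk.
    return all(any(d in failed for d in stripe) for stripe in STRIPES_01)

# Every possible two-disk failure out of four disks.
pairs = list(combinations(range(4), 2))
fatal_10 = [p for p in pairs if raid10_lost(set(p))]
fatal_01 = [p for p in pairs if raid01_lost(set(p))]

print(f"two-disk failures considered: {len(pairs)}")
print(f"fatal for RAID 1+0: {len(fatal_10)} {fatal_10}")
print(f"fatal for RAID 0+1: {len(fatal_01)} {fatal_01}")
```

    With this layout, 2 of the 6 possible two-disk failures are fatal for RAID 1+0, against 4 of 6 for RAID 0+1.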

  • James Lean (1/26/2011)


    In RAID 0+1 you have two stripe sets that are mirrored, therefore you could theoretically lose all of the disks from one of the stripe sets and still have the data available.

    In RAID 1+0 you have a number of 2-disk mirrors that are then striped together. Theoretically you could lose one disk from each mirrored pair and still have the data available.

    And obviously on the flip-side, you could lose one disk from each stripe set in RAID 0+1 and lose all the data; or you could lose both disks from one of the mirrored pairs in RAID 1+0 and lose all the data.

    So in each level you can lose anywhere from two to half the disks, so aren't they both as tolerant as each other?

    No, that's not correct. In Raid 1+0, you have three copies of the data. In Raid 0+1, you only have two copies of the data.

    RAID 1+0 is set up like this:

    D1  D2  --->  D3  D4
    A   A   --->  A   B
    B   B   --->  C   D
    C   C
    D   D

    Raid 0+1

    D1  D2  --->  D3  D4
    A   B   --->  A   B
    C   D   --->  C   D
    E   F   --->  E   F
    G   H   --->  G   H

    So, with Raid 1+0, you can have all the data in D1, D2, and (D3+D4).

    With Raid 0+1, you have all the data in (D1+D2) and (D3+D4).

    In Raid 1+0, you can lose D1 and D2 and still have (D3+D4) with all the data. Or, you can lose D2 and D3 and all the data is still on D1. Or you can lose D3 and D4 and the data is available on D1 and D2.

    In Raid 0+1, if you lost either (D1 and D3) or (D2 and D4), there isn't any set that contains all the data.

  • Nevermind.

  • paul.knibbs (1/26/2011)


    There are more possible ways a two-disk failure could take out RAID 0+1, so by the law of probabilities you're more likely to lose your array from a two-disk failure in that case.

    OK, I see that point. I guess, re-reading the question/explanation, that may have been what the question was trying to illustrate. Although I would say the last line of explanation is definitely wrong:

    RAID 0+1 cannot tolerate more than 2 disk failures.

    As I said before, it can, depending on which disks fail!

    -----
    JL

  • I got the question wrong... but I still think I'm right, based on the way the question was stated.

    If you have 8 disks:

    - Configured as 1+0, you can have failure by losing two disks. But you can lose up to 4 and still be OK, depending on the disks (i.e. two disks from the same mirror versus one disk from each mirror set).

    - Configured as 0+1, you can have failure by losing two disks. But you can lose up to 4 and still be OK, depending on the disks (i.e. one disk from each stripe set, up to all disks in one stripe set plus one in the other).

    So, out of 8 disks, in a worst case scenario, 2 disks will bring you down. Best case, it will take 5. Like everything else, "it depends". Disk failure is generally random-ish in nature, you can't control where the failure will occur.

  • marklegosz (1/26/2011)


    - Configured as 0+1, you can have failure by losing two disks. But you can lose up to 4 and still be OK, depending on the disks (i.e. one disk from each stripe set, up to all disks in one stripe set plus one in the other).

    Incorrect. Your second sentence is wrong.

    Assume you have 8 disks, we'll number 1-8. The mirrors are:

    1 <-> 2

    3 <-> 4

    5 <-> 6

    7 <-> 8

    The stripe is across the four pairs ((1, 2), (3, 4), (5, 6), (7, 8)), but I can read from either disk in each pair. If I lose disks 1, 4, 5, and 8, I still have one disk from each mirrored pair. Disks 2, 3, 6, and 7 are still live and I can read my data.

    However, if I have stripes first, then I have stripes like this:

    (1, 3, 5, 7)

    (2, 4, 6, 8)

    If I lose any disk in the stripe, the stripe is gone. So if I lose 1, then that stripe is gone and I can ONLY read from the second stripe. The OS is fine with this and I can continue. If I lose 8 as well, then all my data is gone. Theoretically I can lose 4 disks (1, 3, 5, 7) and be fine, but if I lose two disks from separate stripes, I'm gone.

    In the worst case of either one, you can lose your data with 2 disk losses, and potentially lose up to 4 disks with no data loss. However, the chance of a 3-disk failure losing all your data is larger with RAID 0+1 because of the lack of tolerance at the individual stripe level. Once I lose 1 disk from 0+1, any loss from the other stripe kills all my data. However, in 1+0, only a loss from the same mirror kills me, not any other disk.
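    The eight-disk layouts above can be checked by brute force. The following is a rough Python sketch that reuses the disk numbers from this post; the function names and the survival rules (1+0 needs one live disk per mirror, 0+1 needs one fully intact stripe) are a simplified model assumed for illustration, not a description of any particular controller.

```python
from itertools import combinations

DISKS = range(1, 9)
MIRRORS = [(1, 2), (3, 4), (5, 6), (7, 8)]  # RAID 1+0 mirrored pairs
STRIPES = [(1, 3, 5, 7), (2, 4, 6, 8)]      # RAID 0+1 stripe sets

def raid10_survives(failed):
    # Every mirrored pair must keep at least one live disk.
    return all(any(d not in failed for d in pair) for pair in MIRRORS)

def raid01_survives(failed):
    # At least one stripe set must be completely intact.
    return any(all(d not in failed for d in stripe) for stripe in STRIPES)

def survivable(k):
    """Return (total, 1+0 survivable, 0+1 survivable) over all k-disk failures."""
    combos = [set(c) for c in combinations(DISKS, k)]
    return (
        len(combos),
        sum(raid10_survives(f) for f in combos),
        sum(raid01_survives(f) for f in combos),
    )

for k in range(1, 6):
    total, ok10, ok01 = survivable(k)
    print(f"{k} failed: {ok10}/{total} survivable on 1+0, {ok01}/{total} on 0+1")
```

    Under this model, two failures are survivable in 24 of 28 cases on 1+0 versus 12 of 28 on 0+1, three failures in 32 of 56 versus 8 of 56, and four failures in 16 of 70 versus 2 of 70. Five failures are fatal for both.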

  • To add to what Steve is saying about the probabilities--let's say one of the disks in that eight-disk array has already failed. With RAID 0+1 we lose all our data if any of the disks in the other stripe dies, which is a 4 in 7 chance--more than 50%! With the other arrangement we only lose all our data if the mirror of the failed disk dies, which is only a 1 in 7 chance (about 14%).

    Of course, if your data is that vital that you're seriously considering the chances of a second disk failing so soon after the first that you wouldn't have time to replace it, you'd have a hot spare anyway... 😉
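    The 4-in-7 and 1-in-7 figures can be reproduced with a few lines of Python. This is only a sketch: it borrows the eight-disk numbering from Steve's post, the helper names are made up for illustration, and it assumes each of the seven remaining disks is equally likely to be the next to fail, which real drives may not honour.

```python
from fractions import Fraction

MIRRORS = [(1, 2), (3, 4), (5, 6), (7, 8)]  # RAID 1+0 mirrored pairs
STRIPES = [(1, 3, 5, 7), (2, 4, 6, 8)]      # RAID 0+1 stripe sets

def fatal_next_raid10(first):
    # Only the mirror partner of the failed disk is fatal.
    return [d for pair in MIRRORS if first in pair for d in pair if d != first]

def fatal_next_raid01(first):
    # Any disk in the stripe set that is still intact is fatal.
    return [d for stripe in STRIPES if first not in stripe for d in stripe]

first_failed = 1
remaining = 7  # eight disks minus the one that already failed

# Assumption: the next failure is uniformly distributed over the remaining disks.
p10 = Fraction(len(fatal_next_raid10(first_failed)), remaining)
p01 = Fraction(len(fatal_next_raid01(first_failed)), remaining)

print(f"RAID 1+0: {p10} ({float(p10):.0%})")
print(f"RAID 0+1: {p01} ({float(p01):.0%})")
```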

  • cengland0 (1/26/2011)


    No, that's not correct. In Raid 1+0, you have three copies of the data. In Raid 0+1, you only have two copies of the data.

    Really? I have never heard that Raid 1+0 has a 66% overhead. In fact I am sure it is just a 50% overhead, so there are only two copies of the data.

    I think if you look at Steve's post he has a very good description of how the two work with 8 drives.

  • UMG Developer (1/26/2011)


    cengland0 (1/26/2011)


    No, that's not correct. In Raid 1+0, you have three copies of the data. In Raid 0+1, you only have two copies of the data.

    Really? I have never heard that Raid 1+0 has a 66% overhead. In fact I am sure it is just a 50% overhead, so there are only two copies of the data.

    I think if you look at Steve's post he has a very good description of how the two work with 8 drives.

    You are right. I've no idea where that strange stuff came from, but certainly not from any RAID 10 documentation. Steve's description is good. In fact the chance of a RAID 10 array losing data at the second disc failure is always 1/N times the chance of RAID 0+1 doing so, assuming there are 2N discs altogether. That's a big improvement (a factor of 2) even for 4 discs; for large arrays it's even bigger.

    Tom
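    The 1/N ratio can be written out as a short derivation. It assumes 2N discs, two-way mirrors for RAID 1+0, two stripe sets of N discs each for RAID 0+1, one disc already failed, and the next failure equally likely to hit any of the 2N - 1 remaining discs. The symbols P_{1+0} and P_{0+1} (chance that the second failure loses the array) are notation introduced here for illustration.

```latex
\[
P_{1+0} = \frac{1}{2N-1}, \qquad
P_{0+1} = \frac{N}{2N-1}, \qquad
\frac{P_{1+0}}{P_{0+1}} = \frac{1}{N}.
\]
```

    For N = 2 (4 discs) this gives 1/3 against 2/3, and for N = 4 (8 discs) it gives 1/7 against 4/7.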

    Note that disks still fail. It's why, when data is really critical, you can't necessarily wait on a rebuild. A disk failure from the same batch of drives could be followed shortly by another disk failure.

    Put the primary filegroup and log on triple mirroring in these cases.

  • Steve Jones - SSC Editor (1/26/2011)


    Note that disks still fail. It's why, when data is really critical, you can't necessarily wait on a rebuild. A disk failure from the same batch of drives could be followed shortly by another disk failure.

    Very true, many years ago we had a case where we put 6 new drives in a server and after about 6 months they started failing, and within 2 months all 6 had failed. They were in a RAID5 set, and lucky for us we got the failed drives replaced and re-built before the next one failed.

    Since then, when possible, I try to get drives from multiple sources to try to avoid them all being from one batch.

  • I would also make sure that the drives in each of my mirror sets, e.g. (1, 2), were not two drives from the same batch. Maybe not even from the same company.

  • Thanks for the question.

    Jason...AKA CirqueDeSQLeil
