A New World Record

  • Comments posted to this topic are about the item A New World Record

  • "Some of these shouldn't be allowed in my mind, like RAID 0 disks, which wouldn't be in a production system. "

    I don't understand this line. I've seen and implemented lots of RAID 0 disk arrays in production. This dates back to the mid-'80s. In fact, I usually recommend RAID 10 whenever possible. It's the most expensive but highest-performing option.

    "Beliefs" get in the way of learning.

  • Robert Frasca (3/4/2008)


    "Some of these shouldn't be allowed in my mind, like RAID 0 disks, which wouldn't be in a production system. "

    I don't understand this line. I've seen and implemented lots of RAID 0 disk arrays in production. This dates back to the mid-'80s. In fact, I usually recommend RAID 10 whenever possible. It's the most expensive but highest-performing option.

    RAID 0 is very different from RAID 10. RAID 0 has no redundancy, so if there is a failure on a single disk in the array, you will lose everything on the array. With RAID 10, each disk in the array is mirrored, so it is highly redundant.

    I don't think RAID 0 is suitable for anything except work files or other temporary data that can be easily recreated.
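
    To put a rough number on that risk, here's a minimal back-of-the-envelope sketch; the per-disk failure probability and disk count are assumptions for illustration, not figures from this thread:

    ```sql
    -- Assumed numbers, for illustration only: if each spindle independently
    -- has probability @p of failing over some period, a RAID 0 stripe of
    -- @n disks loses ALL of its data with probability 1 - (1 - @p)^@n.
    DECLARE @p float = 0.03;  -- assumed per-disk failure probability
    DECLARE @n int = 8;       -- assumed number of disks in the stripe
    SELECT 1 - POWER(1 - @p, @n) AS raid0_data_loss_probability;  -- ~0.22 here
    ```

    The wider the stripe, the worse the odds get, which is why it only makes sense for data you can recreate.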

  • You're absolutely right that RAID 0 has no redundancy; if redundancy is a requirement then, like you, I wouldn't bother creating a RAID 0 array, as a failure could be catastrophic.

    I guess I misunderstood what he was saying. I thought he was implying that striping isn't used in production. If his point is that, by using RAID 0 arrays, they are avoiding the additional overhead of the redundancy that production sites would use, then, of course, he is correct.

    I think my Blood Caffeine Level (BCL) had dropped too low and I wasn't thinking clearly. :hehe:

    "Beliefs" get in the way of learning.

  • RAID 0, for production, including a production workstation, is a bad idea. It's a huge gamble that's not worth it.

    RAID 5 is better for protection; RAID 1 or RAID 10 is even better.

    I just think the systems should include some restrictions that fit with production systems. For example, you could add a trace flag that prevents the t-log from being used. It would certainly make things faster (fewer writes, etc.), but it wouldn't be something you want to do.

  • On the point about tricks used in benchmarks, such as disabling checkpoints, RAW volumes, and RAID 0 stripes: TPC-E sets a much higher bar than previous benchmarks (e.g., TPC-C). This is one of the reasons that Microsoft has switched to TPC-E with SQL Server 2008. (In fact, we aren't planning to publish any TPC-C results on SQL Server 2008.) TPC-E requires that checkpoints be included in the measurement interval. In the published NEC result (1,126 tpsE), there were 16 checkpoints in the two-hour interval -- an average of one every 7.5 minutes. That's not too shabby for a 64-core server running flat out with half a terabyte of memory.

    Furthermore, TPC-E requires that all storage used for database content be able to keep running without interruption if any single disk fails. This can be achieved in various ways (e.g., RAID-1, RAID-5, RAID-10), but it does eliminate RAID-0. In addition to requiring fault-tolerant storage, the benchmark requires that this be tested and verified as part of the independent audit. The results are documented in the Full Disclosure Report submitted to the TPC (and available to the public).

  • Hi Steve,

    We use a production-style disk configuration for the WR.

    Some background information on the setup and findings:

    - SQL 2008 bulk inserts can write large 256 KB IOs (a minimal example follows this list);

    - the NTFS cluster size is set to the maximum of 64 KB;

    Please note that each volume is now automatically track-aligned in Windows 2008.
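
    As a minimal illustration of such a load (the table name, file path, and delimiters are placeholders, not the actual world-record setup), the TABLOCK hint is what lets the bulk load be minimally logged and helps it issue the large block writes:

    ```sql
    -- Hypothetical sketch only; dbo.LoadTarget and the file path are placeholders.
    BULK INSERT dbo.LoadTarget
    FROM 'D:\staging\source.dat'     -- assumed flat-file source
    WITH (
        TABLOCK,                     -- table lock: required for a minimally logged load
        FIELDTERMINATOR = '|',       -- assumed column delimiter
        ROWTERMINATOR = '\n'         -- assumed row delimiter
    );
    ```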

    To make sure that we benefit from this large IO write size, the best disk configuration we found to handle 256 KB IOs (balancing throughput vs. latency) is to use 17 metaluns:

    - Each metalun consists of 4 LUNs,

    - Each LUN has 2 disks in RAID-1;

    (so each disk will service 128 KB IOs.)

    This gives the best latency vs. throughput on the EMC Clariion CX3-80 with a single rack of 165 spindles (146 GB / 15k rpm / 4 Gbit).

    - The Unisys ES7000/one server has 8 x 4 Gbit HBAs configured with a prerelease of EMC PowerPath 5.2 installed.

    In the SQL build we use (10.0.1300.04), to get consistent 256 KB write IOs landing on the disks (a cut-down DDL sketch follows this list):

    - use 64 filegroups;

    - 64 files spread across 16 metaluns;

    - 1 metalun is formatted with a 16 KB cluster size for the 16 KB log-write IOs.

    (With a single filegroup and fewer files, the IO sizes varied between 64 KB / 128 KB / 256 KB, producing more IOs to service.)
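
    A cut-down DDL sketch of this kind of layout (four filegroups standing in for the real 64; drive letters, names, and sizes are illustrative assumptions):

    ```sql
    -- Illustrative only: 4 filegroups / 4 files instead of the actual 64,
    -- each file placed on its own metalun volume.
    CREATE DATABASE WR_Demo
    ON PRIMARY
        (NAME = wr_primary, FILENAME = 'E:\mlun00\wr_primary.mdf', SIZE = 1GB),
    FILEGROUP FG1 (NAME = wr_fg1, FILENAME = 'F:\mlun01\wr_fg1.ndf', SIZE = 50GB),
    FILEGROUP FG2 (NAME = wr_fg2, FILENAME = 'G:\mlun02\wr_fg2.ndf', SIZE = 50GB),
    FILEGROUP FG3 (NAME = wr_fg3, FILENAME = 'H:\mlun03\wr_fg3.ndf', SIZE = 50GB),
    FILEGROUP FG4 (NAME = wr_fg4, FILENAME = 'I:\mlun04\wr_fg4.ndf', SIZE = 50GB)
    LOG ON
        (NAME = wr_log, FILENAME = 'L:\mlunlog\wr_log.ldf', SIZE = 20GB);  -- the 16 KB-formatted log metalun
    ```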

    The data is read from RAID 5 (4+1) disks.

    Best regards,

    Henk van der Valk

    Unisys Performance Centers

  • To make sure that we benefit from this large IO write size, the best disk configuration we found to handle 256 KB IOs (balancing throughput vs. latency) is to use 17 metaluns:

    - Each metalun consists of 4 LUNs,

    - Each LUN has 2 disks in RAID-10;

    (so each disk will service 128 KB IOs.)

    This gives the best latency vs. throughput on the EMC Clariion CX3-80 with a single rack of 165 spindles (146 GB / 15k rpm / 4 Gbit).

    I'm not sure I understand the math here. A RAID-10 configuration requires a minimum of 4 spindles: two striped, plus two mirroring them. So if you have 4 LUNs * 4 disks, that is 16 spindles per metalun, and 17 metaluns * 16 disks = 272 spindles, but you said you had a single rack with 165 spindles.

    Did I miss something?

    Thanks.

    "Beliefs" get in the way of learning.

  • Hello Robert,

    Correct, it's 4x RAID-1 (see attached picture), so each metalun is 4 LUNs * 2 disks = 8 spindles, and 17 metaluns * 8 = 136 spindles, which fits within the 165-spindle rack.

  • Thanks for the note, and good to know that you are doing work to get closer to a real-world simulation. I thought I had read in the notes that this was RAID 0, but I'm glad it's not. Are all systems RAID 1 for TPC-E?
