Odd SQLIOSIM Results

  • We just bought a new, beefy server for a SQL Server 2008 database. We configured the drives in the following manner:

    Drive E: Raid 1 - Where SQL Server 2008 will be installed

    Drive F: Raid 1 - Where the logs will be

    Drive G: Raid 10 - Where the data files will be

    For comparison purposes, I ran the SQLIOSIM utility using the default settings except that I moved the mdf file to Drive G and I moved the ldf file to Drive F. Here are the results for the MDF file:

    Running Average IO Duration: 1 ms

    Number of Times IO Throttled: 17249

    IO Request Blocks: 305

    Total IO Time: 8510614

    The average IO duration is excellent, of course, as 2-4 ms is considered ideal. However, I can't understand why I'm seeing such an excessive number of times the IO is throttled. As a comparison, I ran the same configuration on our SQL Server 08 test server that isn't optimized at all (everything is runnining on a single drive) and I get the following results:

    Running Average IO Duration: 102 ms

    Number of Times IO Throttled: 7388

    IO Request Blocks: 31

    Total IO Time: 191176517

    All of the figures are far worse on this server, as expected, except for Number of Times IO Throttled...which is far better, for some reason.

    Any idea why I'm seeing such a high number on our new server? My understanding is that I should be seeing a far lower number. Is there a potential problem with our hardware or is something in our configuration skewing the results? Thanks in advance for any insight that you can provide.

  • Is the target IO duration the same in both cases?

    With the default settings, that metric means the number of times a response went over 100ms and it slowed down the frequency of I/O requests until the response times fell below the target again, but I think this parameter may be self-adjusting if the average IO duration is above the default target duration (which it is in the test environment).

    I'd suggest fixing the target to say 200ms on both, so that the longest average duration comfortably sits under the target duration to be able to compare the two results.

    There may be legitimate reasons for this as well however as that response time looks suspiciously low. The 1ms results could be being satisfied by a disk cache and when it can't satisfy the request through cache, it goes to disk (or when the write cache is full, it slows down further I/O until the cache has been flushed to disk), which would explain why there's a higher number.

  • Thanks for the reply, Howard. I followed your advice and bumped the target IO up to 200. I got pretty much the same results on my new server. The old server's numbers were worse. I'm still seeing the perplexing issue of all of my numbers for the new server being excellent except for the fairly abysmal "Number of Times IO Throttled." I have a feeling that something in my configuration is throwing off the numbers for the test, but I really don't have a thorough understanding of IO processes. I'm going to try running some other tools, like SQLIO, and see how those results turn out. Thanks for your help.

    (Your comment about the cache may be spot on...I'll look into that possibility)

Viewing 3 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply