Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 12»»

Help interpret SQLIO result Expand / Collapse
Author
Message
Posted Thursday, April 18, 2013 11:59 AM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Friday, June 14, 2013 1:40 PM
Points: 18, Visits: 63
Im trying to either prove or disprove that the slow front-end (web ui) performance is, in some part, attributed to disk bottleneck. I've run SQLIO to gather some metrics, but they'll always be just numbers without context. For example:


c:\Program Files (x86)\SQLIO>sqlio -kR -t4 -s120 -o64 -frandom -b64 -LS -Fparam.txt
sqlio v1.5.SG
using system counter for latency timings, 3579545 counts per second
parameter file used: param.txt
file e:\sqlio\testfile.dat with 2 threads (0-1) using mask 0x0 (0)
2 threads reading for 120 secs from file e:\sqlio\testfile.dat
using 64KB random IOs
enabling multiple I/Os per thread with 64 outstanding
using specified size: 100 MB for file: e:\sqlio\testfile.dat
initialization done
CUMULATIVE DATA:
throughput metrics:
IOs/sec: 1649.36
MBs/sec: 103.08
latency metrics:
Min_Latency(ms): 4
Avg_Latency(ms): 77
Max_Latency(ms): 779

How can I provide context to the above to state definitely that the disk subsystem is underperforming to support a multi client UI tool. All of the articles I've read online only seem to use SQLIO as a tuning tool (optimize within existing hardware scope), not as a tool to determine whether the architecture is suitable or not.

Anyone have any ideas? Am I using the right tool? How can I either make this case or move onto something else?

Thanks in advance.
Post #1444061
Posted Thursday, April 18, 2013 1:54 PM


Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Thursday, December 18, 2014 2:59 AM
Points: 1,356, Visits: 1,180
Have you used performance monitor on the server?

Specifically Physical Disk AVG SEC/READ & SEC/WRITE. Any values over 20ms indicate a problem with disk IO.

Andrew
Post #1444126
Posted Thursday, April 18, 2013 1:59 PM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Friday, June 14, 2013 1:40 PM
Points: 18, Visits: 63
Can anyone provide any feedback on this or please point me in the right direction? Im not sure if I've given enough detail or if this is a question that's been asked too many times around here.

Thanks again.
Post #1444129
Posted Thursday, April 18, 2013 2:03 PM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Friday, June 14, 2013 1:40 PM
Points: 18, Visits: 63
THanks for responding Andrew. Yes, I have. In fact I did that first. Yes, Avg Disk sec/Read is ~.030, which I believe is 30ms.
Post #1444131
Posted Thursday, April 18, 2013 2:07 PM


Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Thursday, December 18, 2014 2:59 AM
Points: 1,356, Visits: 1,180
OK, what's your disk system, DAS or SAN?

Andrew
Post #1444134
Posted Thursday, April 18, 2013 2:56 PM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Friday, June 14, 2013 1:40 PM
Points: 18, Visits: 63
DAS, RAID-5. Unfortunately, our server is installed an a virtual host. 32GB RAM allocated. I can't speak to what other hosts are sharing the same physical storage.
Post #1444151
Posted Thursday, April 18, 2013 3:22 PM


Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Thursday, December 18, 2014 2:59 AM
Points: 1,356, Visits: 1,180
Alright, as you have stats showing that there is latency from the disk system I would try and prove it.

You're right when you initially said that figures showing latency are hard to completely confirm that the disk system is the bottleneck. You don't have a baseline so it is impossible to determine that that is the problem (the latency from the disk could have always been at those figures and something else is causing the problem).

I would try and prove it by running a query that would retrieve a large number of records from the database.

Initially (if the table has not been queried recently), SQL Server would have to retrieve the data from disk and copy into memory. Once that has been done, subsequent executions of that query would return quickly. This should prove that as the initial execution was longer, the disk system is the bottleneck.

Should be quite a good way of demonstrating the problem by showing the run times of the first query against subsequent queries.

If your system has many different queries executed against it, this would be a good demonstration of where the problem is. However if you have multiple executions of the same query, the problem lies elsewhere.

I would caution against running this on your live system as it would affect performance.

What do you think?

Andrew
Post #1444169
Posted Thursday, April 18, 2013 3:28 PM


Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Thursday, December 18, 2014 2:59 AM
Points: 1,356, Visits: 1,180
Sorry, forgot to say that when you run the query have a look at Activity Monitor for IO_COMPLETION wait types
Post #1444173
Posted Friday, April 19, 2013 7:45 AM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Friday, June 14, 2013 1:40 PM
Points: 18, Visits: 63
This sounds good and is kinda what I've been doing. Do you know of any MS recommendation white papers that I can use to compare my latency results against to present to non-technical individuals? That would really go a long way to kinda prove that we're trying to get blood from a stone.

Thanks.
Post #1444404
Posted Friday, April 19, 2013 8:36 AM
Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Today @ 12:07 AM
Points: 3,113, Visits: 11,547
Your SQLIO test file size is far too small to be a valid test. I prefer to make it something close to the size of the database or at least 20 GB. Run a short throwaway test to create the test file, and use it for the remaining tests. This makes sure you are reading/writing data from/to the disk, instead of the disk controller cache.

Also, you should really run a series of tests, where you add to the workload on each test until you reach the limit of the acceptable level of Avg Latency, say 40 MS. You can do this by increasing the number of threads on each test.

You also need to run tests for sequential read, sequential write, random read, and random write. Random read and random write are probably the most important for SQL Server.

Also, run these tests with different block sizes, say 4K, 8K, 16K, 32K, and 64K. 64K is probably the most important for SQL Server.

One final suggestion: setup a standard test suite and perform it on every new SQL Server before you put it into production. Make sure the IO performance is OK before you go live. Save the test results for each server so that you can refer to them later and compare them to a new series of tests if there are problems later. An Excel worksheet is a good place to store the data.

I know this seems like a lot of work, but it can save a lot of trouble in the long run, and it gives you hard numbers to show there is a problem. I have managed hundreds of servers and I can tell you from experience that disk IO performance problems are not uncommon, especially with SAN.
















Post #1444459
« Prev Topic | Next Topic »

Add to briefcase 12»»

Permissions Expand / Collapse