Hardware or Bad Queries = Performance issues?

  • Help Please!!

    How can one identify whether performance issues on a SQL 2008 R2 server are being caused by hardware or by inefficiently written queries?

    I have looked at a lot of DMV data for the server and don't think the issues are caused by hardware. What should I look at that would point me in the right direction, so that I can take it up with the supplier and get them to resolve the issues with their application?

    Any advice on the above will be highly appreciated.

    Thanking you for your help

    Kailash.

  • What kind of hardware do you have running?

    If you suspect bad queries, look for ones with especially high reads or long durations, as these may point to missing or insufficient indexes. Also check the server and query waits to see which are the highest. For instance, high PAGEIOLATCH_EX "may" indicate issues with the disk subsystem (hardware), because the query is waiting for a page to be read from disk; CXPACKET may indicate poorly written queries (or ones that have large table scans in them), and so forth.

    This is such a loaded question. What specifically have you looked at?

    FYI - if you're confident that your indexes are solid and your statistics are up-to-date, then most often you can dig deeper into the hardware configuration (RAM, spindles, etc)

    ______________________________________________________________________________
    Never argue with an idiot; they'll drag you down to their level and beat you with experience

  • When in doubt, assume bad queries and poor indexing.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • @Gail +1,000,000 - that's 99% the case here at our shop 🙂

  • MyDoggieJessie (12/21/2012)


    For instance, high PAGEIOLATCH_EX "may" indicate issues with the disk subsystem (hardware) because it's waiting to look up the page from the disk, CXPACKETS may indicate poorly written queries (or ones that have large table scans in them), and so forth.

    On the other hand, PageIOLatch waits may indicate queries that are requesting far too much data (table or index scans) and CXPacket may indicate that you have lots of queries benefiting from parallelism

    Gail Shaw
  • GilaMonster (12/21/2012)


    MyDoggieJessie (12/21/2012)


    For instance, high PAGEIOLATCH_EX "may" indicate issues with the disk subsystem (hardware) because it's waiting to look up the page from the disk, CXPACKETS may indicate poorly written queries (or ones that have large table scans in them), and so forth.

    On the other hand, PageIOLatch waits may indicate queries that are requesting far too much data (table or index scans) and CXPacket may indicate that you have lots of queries benefiting from parallelism

    LOL, so those waits really don't tell a person much!

    I've found some queries that tell me what's running when an issue arises. I also check the standard reports for top queries in regards to CPU and IO. Then you can try to work on those queries.

    Does anyone else do anything other than that to find issues on servers if you're not even sure issues exist?

  • scogeb (12/21/2012)


    LOL, so those waits really don't tell a person much!

    Well, excessive PageIOLatch indicates that the IO subsystem is being driven harder than it can handle. Whether that's because too much is being asked of it or because the hardware is inadequate requires further investigation.

    CXPacket just indicates that queries are running in parallel. It's a wait a lot of people misinterpret.

    Try http://www.simple-talk.com/books/sql-books/troubleshooting-sql-server-a-guide-for-the-accidental-dba/, start with chapter 1.

    Gail Shaw
  • Taking this with a grain of salt and the "it depends" answer, another way to see if you have disk contention is to track the average disk queue length for the disks - this can often tell you if the bottleneck is occurring within your disk subsystem (of course as Gail has already mentioned, always assume bad indexes/stats).

    I've seen disk queuing become excessive when a query was performing hundreds (sometimes millions) of reads on a table due to a clustered index scan on a table with 30+ million rows. A little magical index tweaking later, the result was a 100% non-clustered index seek, the reads went down to < 100K, and the disk queuing dropped significantly.

    Again, without knowing what your hardware architecture is, it's difficult to really help you out. Our suggestions could differ like night and day between a server with 4 spindles in RAID 5 and a server with a SAN containing 64 spindles. Where you should be concentrating your efforts in these cases is relative.

  • MyDoggieJessie (12/21/2012)


    Taking this with a grain of salt and the "it depends" answer, another way to see if you have disk contention is to track the average disk queue length for the disks

    Sorry to pick on you today, but disk queue length is a near-meaningless counter these days. There's too much between the server and the disks to get a sensible interpretation of that counter (unless you're dealing with direct attached drives), plus SQL is designed to intentionally queue up multiple IOs, thus sending the queue length very high (read ahead reads).

    The Avg. Disk sec/Read and Avg. Disk sec/Write counters are a lot easier to interpret and track.
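
    A comparable latency figure can also be derived inside SQL Server, assuming two snapshots of sys.dm_io_virtual_file_stats taken some time apart. That DMV's counters are cumulative, so the average latency over an interval is the change in stall time divided by the change in IO count. The Python below is only a sketch of that arithmetic; the class and function names are made up, and the sample values are invented purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class FileStatsSnapshot:
    """Cumulative counters, named after sys.dm_io_virtual_file_stats columns."""
    num_of_reads: int
    io_stall_read_ms: int
    num_of_writes: int
    io_stall_write_ms: int


def average_latency_ms(earlier, later):
    """Return (avg read ms, avg write ms) for the interval between two snapshots."""
    reads = later.num_of_reads - earlier.num_of_reads
    writes = later.num_of_writes - earlier.num_of_writes
    read_stall = later.io_stall_read_ms - earlier.io_stall_read_ms
    write_stall = later.io_stall_write_ms - earlier.io_stall_write_ms
    # Guard against intervals with no IO at all.
    read_ms = read_stall / reads if reads else 0.0
    write_ms = write_stall / writes if writes else 0.0
    return read_ms, write_ms


# Hypothetical snapshots, not taken from the server in this thread.
before = FileStatsSnapshot(1_000, 12_000, 500, 2_000)
after = FileStatsSnapshot(3_000, 52_000, 900, 4_400)
print(average_latency_ms(before, after))
```

    Working from deltas rather than the raw cumulative values keeps an old spike (a backup window, say) from being averaged into the current picture.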

    Gail Shaw
  • It's all good:-D I appreciate constructive criticism(s)

  • Thank you all for your valued advice. The server in question is a virtual server with SAN-connected disks. What I find is that the queries causing the most IO are the application's. I see a huge number of logical reads for these queries, some with a large number of logical writes too; however, the corresponding physical reads are fairly low, the largest being 3772.

    Following is a brief snapshot of the top five queries that have executed on the server:

    avg_cpu_cost  execution_count  total_worker_time
        12549805                1           12549805
         1962439               52          102046877
        26721719               25          668042982
          666357               20           13327149
        11479003                6           68874022
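
    A rough way to read the snapshot above, assuming the columns come from sys.dm_exec_query_stats (where total_worker_time is reported in microseconds) and that avg_cpu_cost is simply total_worker_time divided by execution_count. The Python below is only an illustrative sketch with the posted figures typed in by hand; the variable names are made up for the example.

```python
# Figures copied from the post: (avg_cpu_cost, execution_count, total_worker_time).
# Assumption: times are in microseconds, as in sys.dm_exec_query_stats.
top_queries = [
    (12549805, 1, 12549805),
    (1962439, 52, 102046877),
    (26721719, 25, 668042982),
    (666357, 20, 13327149),
    (11479003, 6, 68874022),
]

MICROSECONDS_PER_SECOND = 1_000_000

# Rank by total CPU consumed rather than by average cost per execution.
by_total_cpu = sorted(top_queries, key=lambda row: row[2], reverse=True)

for avg_us, executions, total_us in by_total_cpu:
    print(
        f"executions={executions:>3}  "
        f"avg_cpu_s={avg_us / MICROSECONDS_PER_SECOND:7.2f}  "
        f"total_cpu_s={total_us / MICROSECONDS_PER_SECOND:8.2f}"
    )
```

    On that reading, the query with 25 executions averages roughly 26.7 seconds of CPU per run and about 668 seconds in total, which would make it the first one to examine.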

    The maximum “Average IO stall” is roughly 35 ms; however, I do see a lot of latch waits, as below:

    wait_type       waiting_tasks_count  wait_time_ms  signal_wait_time_ms  io_wait_time_ms
    PAGEIOLATCH_EX                 2831         27461                  288            27173
    PAGEIOLATCH_SH                77799        924785                 5080           919705
    PAGEIOLATCH_UP                 1073          6299                  182             6117
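
    The raw totals are hard to judge on their own, so one possible next step is to divide wait_time_ms by waiting_tasks_count to get the average length of a single wait. The Python below is a small sketch of that, using the PAGEIOLATCH figures from the post; the helper name is made up for the example.

```python
# PAGEIOLATCH figures from the post:
# wait_type -> (waiting_tasks_count, wait_time_ms, signal_wait_time_ms)
pageiolatch_waits = {
    "PAGEIOLATCH_EX": (2831, 27461, 288),
    "PAGEIOLATCH_SH": (77799, 924785, 5080),
    "PAGEIOLATCH_UP": (1073, 6299, 182),
}


def average_wait_ms(waiting_tasks_count, wait_time_ms):
    """Average time, in milliseconds, that one task spent in this wait."""
    return wait_time_ms / waiting_tasks_count


for wait_type, (tasks, wait_ms, signal_ms) in pageiolatch_waits.items():
    # Total wait minus signal wait leaves the time spent waiting on the resource.
    resource_ms = wait_ms - signal_ms
    print(
        f"{wait_type}: {average_wait_ms(tasks, wait_ms):.1f} ms per wait, "
        f"{resource_ms} ms of resource wait"
    )
```

    The averages come out between roughly 6 and 12 ms per wait, and the subtraction reproduces the io_wait_time_ms column from the post.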

    The top 10 waits on the server are as follows:

    wait_type               waiting_tasks_count  max_wait_time_ms  wait_time_s
    BROKER_EVENTHANDLER                      11         170845047   178478.416
    SOS_SCHEDULER_YIELD                 8474925               187     9699.191
    BROKER_RECEIVE_WAITFOR                    4            594234     1199.999
    PAGELATCH_EX                         522662               193      579.588
    PAGEIOLATCH_SH                        47005              4249      430.677
    OLEDB                               5523944            109265      429.257
    ASYNC_IO_COMPLETION                      32            143379      254.307
    BACKUPBUFFER                          13223              1741      197.967
    BACKUPIO                               8601               554      191.144
    CXPACKET                               4887              4727      115.389
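
    One way to make this list easier to read is to drop waits that are normally just idle background activity and look at each remaining wait as a share of the total. The Python sketch below does that with the posted figures. Treating the two Service Broker waits as ignorable is an assumption (they are commonly filtered out of wait analysis), and the names used are made up for the example.

```python
# Top waits from the post: wait_type -> wait_time_s.
top_waits = {
    "BROKER_EVENTHANDLER": 178478.416,
    "SOS_SCHEDULER_YIELD": 9699.191,
    "BROKER_RECEIVE_WAITFOR": 1199.999,
    "PAGELATCH_EX": 579.588,
    "PAGEIOLATCH_SH": 430.677,
    "OLEDB": 429.257,
    "ASYNC_IO_COMPLETION": 254.307,
    "BACKUPBUFFER": 197.967,
    "BACKUPIO": 191.144,
    "CXPACKET": 115.389,
}

# Assumption: these Service Broker waits are idle background waits.
IGNORED_WAITS = {"BROKER_EVENTHANDLER", "BROKER_RECEIVE_WAITFOR"}

remaining = {
    name: secs for name, secs in top_waits.items() if name not in IGNORED_WAITS
}
total_s = sum(remaining.values())

# Largest wait first, with its share of the remaining wait time.
ranked = sorted(remaining.items(), key=lambda item: item[1], reverse=True)
for name, secs in ranked:
    print(f"{name:<22}{secs:>12.3f} s{100 * secs / total_s:>7.1f} %")
```

    With those two excluded, SOS_SCHEDULER_YIELD accounts for roughly 80% of the listed wait time. That would be consistent with CPU-heavy queries rather than a slow disk subsystem, though it is an interpretation and not something the figures prove on their own.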

    Please let me know if you find anything that I could use to pin the supplier down.

    Thank you for all your help

    Kailash.

  • I may be wrong, so if I am please tell me so. Aren't logical reads reads from data cached in memory? Would these be good reads as SQL Server doesn't have to go to disk to get the data?

  • Lynn Pettis (12/24/2012)


    Aren't logical reads reads from data cached in memory?

    Yes

    Would these be good reads as SQL Server doesn't have to go to disk to get the data?

    Maybe. 🙂

    All reads done are logical (the QP knows nothing about a disk or a file). Some reads may additionally be physical reads.

    Gail Shaw
  • Lynn Pettis (12/24/2012)


    I may be wrong, so if I am please tell me so. Aren't logical reads reads from data cached in memory?

    Like Gail said... Yes.

    Would these be good reads as SQL Server doesn't have to go to disk to get the data?

    Maybe not. For example, consider the lowly Triangular Join. Do you believe that reading the same rows in an N²/2 fashion is a good thing just because the data is in memory? Do you think that trying to load all of the columns and rows of a 140-column, 14-million-row table into memory (even on purpose) is a really good idea if it can be avoided?

    Of course you don't. But those would definitely be considered to be "bad reads" in my book even if they were on an SSD.
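
    To put a number on the triangular join: if row i of an N-row table is matched against rows 1 through i (as in a self-join on a "less than or equal" condition), the rows touched total N(N+1)/2, which is roughly N²/2. A small Python sketch of that arithmetic follows; the function names are illustrative only.

```python
def triangular_join_rows(n):
    """Rows touched when row i is joined to rows 1..i, for i = 1..n."""
    return n * (n + 1) // 2


def brute_force_rows(n):
    """Same count by explicit looping, as a check on the formula."""
    return sum(1 for i in range(1, n + 1) for _ in range(1, i + 1))


for n in (10, 1_000, 14_000_000):
    print(f"{n:>12,} rows -> {triangular_join_rows(n):,} rows touched")
```

    Every one of those touches is a read, whether or not the pages are already in memory, which is why a high logical read count is worth chasing even when physical reads are low.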

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.



  • Jeff Moden (12/26/2012)


    Lynn Pettis (12/24/2012)


    I may be wrong, so if I am please tell me so. Aren't logical reads reads from data cached in memory?

    Like Gail said... Yes.

    Would these be good reads as SQL Server doesn't have to go to disk to get the data?

    Maybe not. For example, consider the lowly Triangular Join. Do you believe that reading the same rows in an N²/2 fashion is a good thing just because the data is in memory? Do you think that trying to load all of the columns and rows of a 140-column, 14-million-row table into memory (even on purpose) is a really good idea if it can be avoided?

    Of course you don't. But those would definitely be considered to be "bad reads" in my book even if they were on an SSD.

    So the second one is "It depends."

    And Jeff, you are correct, I would not consider a triangular join good in any case.
