Troubleshooting with. wait stats (last 2 weeks included)

  • I'm troubleshooting a slow SQL Server. Our server team is adamant it doesn't need more memory or CPU. I've been working through the wait stats, but can't figure out where to hunt next. Here is a link to an Excel sheet with stats, and below is a description of the tabs:

    https://www.dropbox.com/s/mg83ru9zv4rc9o6/waitstats_db.xlsx?dl=0

    WaitStats - I've been running Paul Randall's wait stat script for the past couple of weeks. This is the result of the top 4 wait stats during that time period by day.

    sys.configurations - Current configurations

    sp_BlitzFirst - Brent Ozar's script with 30 second sampling

    If anyone has any advice or direction, it is much appreciated. One caveat - This app was built by a different company, so some of my options for rewriting procedures, etc. are limited.

  • When you say a "slow server", is it every query that is slow?  Is it a specific stored procedure or view that is slow or everything you do on the server?

    What does the CPU and memory in use look like (perfmon will help you monitor that)?

    Are you sure it is the database that is slow and not the app or client machines accessing the app?

    If you have a test version of the database, I would spin up profiler or an extended events session on the database, connect from the test version of the app to the test database with minimal connections (ie just you) and see what is actually getting run against the database and how long the database queries are actually taking.

    I had our RnD guys come to me telling me the database was slow before and when I dug into it more, it turned out they needed to redo some indexing stuff with elasticsearch on the server and nothing needed to be done on the database side.

    Failing that, I would reach out to the app vendor to see if they have run into something similar to what you are seeing.  It could be something simple like compatability level of the database or a new version of the app, or it could be something a bit more complicated.  Software vendor would be my first stop though; they know their applications and will be the most likely to point you in the right direction.  If they prove to be less than useful, I would try to narrow down what exatly is slow to the best of my ability (SQL, app, network, disk, memory, CPU, etc).

    If it is a VM, you could always ask your server team to toss more memory and CPU at it briefly as a proof of concept.  Looks like you are configured to use roughly 122 GB of memory.  Do you have enough free memory for the OS and other things running on the server (antivirus, other SQL instances, etc)?

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • I just realized I misread that configuration... it isn't 122 GB of memory, it is 12.2 GB of memory.

    Depending on database size, that may be too small, but I'd still do perfmon to see how much is in use at any given time.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • Thanks a bunch! This is helpful, and I will work through your recommendations to see what I find.

  • Taking a look at your dropbox, waits #2 and #3 both could point to problem with I/O and slow disks.  Are you reading and writing to fast disks?  Your number one Wait  is  CXPacket.  This wait indicates processors are waiting on others to finish for a query when it goes parallel which is not efficient.  Do you have MAXDOP and CTFP set to recommended values or the defaults?

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply