Performance monitoring SQL Server and vendor applications

  • Hi,

    We are a team of 6 DBAs for a government district servicing 6 vendor applications. Our servers are virtualized and run SQL Server 2008 R2, plus one legacy SQL Server 2000 instance on a physical box. In July of this year, our IT department invested heavily in a Pure Storage SAN, migrating from an EMC SAN solution.

    Since then, one of our applications has suffered from periodic degraded performance. We have been working with the vendor but have found nothing wrong. The director, based on the vendor's recommendation, has asked us not to use SQL Server Profiler.

    My question: "What tools might we consider to help? SolarWinds Database Performance Analyzer - the VM option claims to have a close-to-zero footprint. Does anyone have other suggestions of what we can try?"

    John

    SQL 2012 Standard VPS Windows 2012 Server Standard

  • Johnny B (11/23/2015)


    Hi,

    We are a team of 6 DBAs for a government district servicing 6 vendor applications. Our servers are virtualized and run SQL Server 2008 R2, plus one legacy SQL Server 2000 instance on a physical box. In July of this year, our IT department invested heavily in a Pure Storage SAN, migrating from an EMC SAN solution.

    Since then, one of our applications has suffered from periodic degraded performance. We have been working with the vendor but have found nothing wrong. The director, based on the vendor's recommendation, has asked us not to use SQL Server Profiler.

    My question: "What tools might we consider to help? SolarWinds Database Performance Analyzer - the VM option claims to have a close-to-zero footprint. Does anyone have other suggestions of what we can try?"

    John

    Tell your director, and the vendors, to get their heads out of their collective butts. 🙂 I have used server-side traces to local disk on systems with many thousands of transactions per second with very little overhead on the box. And once you have traced to disk, you can analyze the files to do the all-important aggregate trace analyses to find the things that REALLY need to be tuned - i.e. that query that takes just 0.4 seconds but gets called 150 times a minute, etc.

    You can use Profiler graphically too, but you need to use it correctly to do that safely.
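
    For anyone who has not set one up, here is a minimal sketch of a server-side trace to local disk. The file path, rollover size, and event/column choices below are placeholders to adapt, not a recommendation:

```sql
-- Sketch: lightweight server-side trace to local disk.
-- Path and rollover size are placeholders; adjust for your server.
DECLARE @TraceID int;
DECLARE @maxfilesize bigint = 512;  -- MB per rollover file
DECLARE @on bit = 1;

-- Option 2 = TRACE_FILE_ROLLOVER; the .trc extension is appended automatically
EXEC sp_trace_create @TraceID OUTPUT, 2, N'D:\Traces\perf_trace', @maxfilesize, NULL;

-- Event 12 = SQL:BatchCompleted; columns 1 = TextData, 13 = Duration,
-- 16 = Reads, 18 = CPU
EXEC sp_trace_setevent @TraceID, 12, 1, @on;
EXEC sp_trace_setevent @TraceID, 12, 13, @on;
EXEC sp_trace_setevent @TraceID, 12, 16, @on;
EXEC sp_trace_setevent @TraceID, 12, 18, @on;

EXEC sp_trace_setstatus @TraceID, 1;  -- 1 = start
SELECT @TraceID AS TraceID;  -- keep this to stop (0) and close (2) the trace later
```

    The resulting .trc files can then be loaded for the aggregate analysis Kevin describes, e.g. via fn_trace_gettable or readtrace/RML.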

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on Google's mail service

  • If the vendor failed to present the evidence described in http://blogs.msdn.com/b/sqlmeditation/archive/2012/12/12/meditation-on-sql-trace-performance-impact.aspx, the vendor is spreading FUD. Because tracing is being replaced by extended events, MS is unlikely to make modifications to its architecture or its wait types. I have seen significant TRACEWRITE + OLEDB waits in SQL Server 2012, so I slapped the only sysadmin foolish enough to trace everything but the kitchen sink via SQL Server Profiler (a client-side trace).

    When tracing, it all comes down to ensuring that the event generation rate does not exceed the event consumption rate. When event generation outpaces consumption, events pile up in SQL Server's trace buffers until the client (with a client-side or a server-side trace) consumes them. Some events (such as Locks:Acquired and Locks:Released) are not very big (XXX bytes) but are extremely chatty. They can cause event generation to skyrocket, but otherwise are not very useful. Other events, such as Performance:Showplan XML Statistics Profile, can be extremely large (XXKB) and are generated once per statement. They can cause event generation to skyrocket, but can be useful. The problem with SQL Server Profiler or client-side tracing is that the rate at which events are generated is not very obvious (they scroll by too quickly, and SQL Server Profiler's IO and memory consumption are often overlooked). A server-side trace would allow you to watch file sizes (and the trace disk's performance). I trace over 2000 batches/sec to a 100GB SSD dedicated to tracing needs. Even so, I am careful when a problem concerns high CPU consumption (unexpectedly high rates of execution being one of my systems' more common "problems").
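
    One way to keep the consumption side manageable is to filter at the trace level. A sketch, assuming @TraceID holds the ID of an existing trace and using an arbitrary half-second threshold (server-side traces on SQL 2005 and later record Duration in microseconds):

```sql
-- Column 13 = Duration, logical operator 0 = AND, comparison operator 4 = >=
-- The threshold value is an example; Duration filters must be bigint.
DECLARE @minDuration bigint = 500000;  -- 0.5 seconds, in microseconds
EXEC sp_trace_setfilter @TraceID, 13, 0, 4, @minDuration;
```

    Note that a filter only discards events after they are generated, so it reduces trace IO but not the cost of producing chatty events in the first place.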

    The oddest situation I ever came across was a team that decided to do a "server-side" trace to a network share. They filtered their trace so that "only" durations longer than 2 minutes would be sent to the share. They assumed the filter spared SQL Server the work, but an event must first be generated; only after that can it be filtered. Their problem was ASYNC_NETWORK_IO waits. As their network bandwidth got constrained, more statements started waiting on clients to fetch results (ASYNC_NETWORK_IO), so those statements blocked other statements, thus more statements took longer than 2 minutes, hence more tracing to their share was needed. And the twist was that their share suffered from the same lack of network bandwidth.... That positive feedback loop brought their SQL Server down in a matter of minutes. They had to stop tracing to a share, and (more importantly, IMHO) they had to increase their network bandwidth. A similar problem can happen if a trace is written to the same disk (or controller) that serves database or log file IO, when the salient performance bottleneck concerns disk IO (PAGEIOLATCH%, WRITELOG waits, etc.).
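
    A quick sanity check worth running on any server with mystery overhead is to see which traces are active, where they write, and whether they are dropping events, alongside the cumulative waits mentioned above:

```sql
-- Active traces: a NULL path means a rowset (Profiler-style client) trace
SELECT id, status, path, is_rowset, buffer_count, buffer_size,
       event_count, dropped_event_count
FROM sys.traces;

-- Cumulative waits relevant to the scenarios described above
SELECT wait_type, waiting_tasks_count, wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type IN (N'TRACEWRITE', N'ASYNC_NETWORK_IO')
ORDER BY wait_time_ms DESC;
```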

    If you were to call MS with a performance problem, they would send you the internal version of pssdiag, which is available on codeplex.com. Pssdiag can harness some extended events, but server-side tracing still handles the lion's share of diagnostic data collection (a perfmon log and the output of 10-second polled DMVs are its "other two legs"). Pssdiag's general performance template collects batch-level, not statement-level, events. Its detailed template collects statement-level events plus Showplan Statistics Profile (the least expensive Performance Profile event). You upload that data to Microsoft. Codeplex's SQL Nexus (run on a non-production SQL Server box) uses readtrace to consume and aggregate what pssdiag collected, and presents summaries as graphs and tables (which is usually what MS sends back to you :). I would be careful about using anything but pssdiag's general performance template on a busy 16-way or greater box (SharePoint being one system that does not always behave well when detailed traces are run).

    As far as "less intrusive" monitoring tools go, you will find a variety of suggestions, and I have no preference for one over another. But given this web site's sponsor, I suspect you will find a predilection to suggest Redgate. In all honesty, which works best depends upon your needs and your system. Take a few out for a free test drive and see which you like. You should also ask your vendor what they use, and then choose a different one. That way, there is less chance of the wool being pulled over both their eyes and yours :cool:.

    And don't forget there are two sides to a performance "coin": Inadequate supply versus excessive demand. Best to not take sides. Strive to determine which is cheaper or quicker to fix (within a budget).
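
    Since extended events came up above as tracing's replacement: a rough XE equivalent of a filtered batch-level trace, with a placeholder file path and an arbitrary half-second threshold (duration is in microseconds), might look like this:

```sql
-- Sketch of a lightweight extended events session; path and filter are placeholders.
CREATE EVENT SESSION perf_light ON SERVER
ADD EVENT sqlserver.sql_batch_completed (
    ACTION (sqlserver.client_app_name, sqlserver.database_id)
    WHERE duration > 500000)                  -- only batches over 0.5 seconds
ADD TARGET package0.event_file (
    SET filename = N'D:\Traces\perf_light.xel',
        max_file_size = 512)                  -- MB per file
WITH (EVENT_RETENTION_MODE = ALLOW_SINGLE_EVENT_LOSS,
      MAX_DISPATCH_LATENCY = 5 SECONDS);

ALTER EVENT SESSION perf_light ON SERVER STATE = START;
```

    Unlike a trace, the filter here is evaluated before most of the event payload is collected, which is part of why XE tends to be cheaper.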

  • Johnny, if you are looking for a third-party tool, SolarWinds DPA is a good one. I currently use it and it is helpful. Not only will it show you expensive queries, when they ran, and what they were waiting on, but you can also set up alerts to tell you when certain blocking thresholds are being exceeded. There are built-in dynamic management views that can give you similar information when you run them during a period of degraded performance. You can also set up the blocked process report and use Michael J. Swart's Blocked Process Report Viewer to make sense of it.
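
    For the blocked process report mentioned above, the server must first be told how long a block should last before a report is emitted. A sketch of enabling it (the 20-second threshold is just an example):

```sql
-- Emit a blocked process report whenever a task has been blocked 20+ seconds.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'blocked process threshold', 20;  -- seconds; 0 disables
RECONFIGURE;
```

    The reports themselves are then captured via the Blocked process report trace event or the sqlserver.blocked_process_report extended event, and fed to the viewer.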

  • I have to agree with Kevin. I don't see what's wrong with running a trace either, but if that's the edict you have to live with, so be it. You can either try to get the boss to change his mind or live with it.

    I use SQL Sentry to monitor my servers. It shows a bunch of graphs and you can go back in history to answer the question "We got an email from some user on the other side of the planet. What happened at 3:42 AM today?" You can identify bottlenecks and go down to the level of what SQL was running at the time. It does the job.

    You can also use the DMVs yourself to look for poor performing queries.
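
    As a starting point for the DMV route, something like the following common pattern against sys.dm_exec_query_stats (which only reflects plans still in cache) surfaces the top CPU consumers:

```sql
-- Top 20 statements by cumulative CPU, with per-execution averages.
SELECT TOP (20)
       qs.execution_count,
       qs.total_worker_time / qs.execution_count  AS avg_cpu_microsec,
       qs.total_elapsed_time / qs.execution_count AS avg_elapsed_microsec,
       SUBSTRING(st.text, (qs.statement_start_offset / 2) + 1,
                 ((CASE qs.statement_end_offset
                     WHEN -1 THEN DATALENGTH(st.text)
                     ELSE qs.statement_end_offset
                   END - qs.statement_start_offset) / 2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_worker_time DESC;
```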

  • I want to thank everyone for the replies. This has been a frustrating and political situation. The organization has settled back into a reactive rather than proactive stance. I've downloaded the DPA free version and will look at the other tools. I will also try to get a statement from the vendor about the traces. It is absurd.

    I'm curious why no one mentioned extended events. Are they too new and unfamiliar to folks, or are they not yet a mature strategy?

    John

    SQL 2012 Standard VPS Windows 2012 Server Standard
