Performance Tuning Using Extended Events: Part 1

  • Comments posted to this topic are about the item Performance Tuning Using Extended Events: Part 1

  • bdavey - Wednesday, April 19, 2017 10:53 PM

    Comments posted to this topic are about the item Performance Tuning Using Extended Events: Part 1

    First, thank for the great article.
    So when can we expect parts 2 & 3 bdavey?

  • Thanks andrew04.
    The next part is coming out next Thursday and the final one on the Thursday after that.

  • So I've been thinking about and wanting to use XE for a while but as yet have not gotten comfortable enough with them to use them in production instances.  Are these four sessions lightweight enough to leave running all the time on production boxes like one might with Perfmon counters or are they the type of thing you enable after the fact to perform more effective troubleshooting?

  • Hi kris386,

    I've been using these xe's for many years on my production oltp & olap servers and have not had any issues. They are light weight and I wouldn't expect you to have any problems.

  • Very cool.  I'll give this a go on a few servers and see how things work out.  

    BTW, in the beginning of your post under the file size discussion you mention a total of 

    3/4TB of event data
    Should that not be 3/4GB?

    Thank you for a solid series of posts.

  • kris386 - Thursday, May 25, 2017 9:18 AM

    Very cool.  I'll give this a go on a few servers and see how things work out.  

    BTW, in the beginning of your post under the file size discussion you mention a total of 

    3/4TB of event data

    Should that not be 3/4GB?

    Thank you for a solid series of posts.

    Yep, you are right. It should be 3/4 GB. 
    Thanks for the catch and I'm glad you found this useful.

  • Good article (and yeah, I know it's older, I just saw it). I'm a huge proponent of Extended Events, so it's awesome to see them getting promoted.

    One note, the deadlock event may not be necessary on most systems because the system_health session (running by default on all machines) has it included. I just wanted to mention that.

    ----------------------------------------------------
    The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood...
    Theodore Roosevelt

    The Scary DBA
    Author of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd Edition
    Product Evangelist for Red Gate Software

  • Thanks Grant! I'm really excited about the articles getting a second life and am glad to see feedback.

  • I'm hoping that you guys can point a finger in the right direction for me.  We're bouncing on the ceiling for CPU utilization for our prodSQL.  I'm researching ways to take pressure off the CPU utilization so that we can get a new backup agent running.  My colleague has suggested raising the Cost Threshold for Parallelism from 50 to 500.  There are 20 threads overall, on the server with 2 CPU's. The MaxDOP is set to 10, since there are 2 NUMA nodes of 10 threads each.

    I'm trying to figure the best way to measure CPU utilization before and after the change.  I figure one Extended Events measure I should use is CXPacket wait times.  Can you guys suggest other measures, and steer me toward a good blog post to write the TSQL?

    Thanks!

    Brandon.Forest@purple.us

    Senior SQL DBA

    Purple Communications

  • I'm curious - do you have 2 processors with 5 cores each with HT active? or 2 * 10 cores = 20 HT?

    Regardless of it I would probably change that maxdop to another value lower than 10 - do have a look at https://littlekendra.com/2016/07/14/max-degree-of-parallelism-cost-threshold-for-parallelism/ for some info.

    Regarding changing CTP from 50 to 500 - did you look at your p,lan cache to see if that will even make any difference? in one of my systems half the queries had a cost below 25, and the other half above 900 so on my case that change your colleague proposed would not make a dent.

    Have a look at https://www.scarydba.com/2017/02/28/determining-the-cost-threshold-for-parallelism/ for some guidance on how to get the values - and potentially the best values for your system.

     

    As for CXPacket - in my opinion that will not help at all - just tells you that something was parallel and does not really give you your cpu usage. Perfmon for this and a SQL Monitoring tool would be your best bets. (and rewriting your queries if most go 100% cpu for a long time)

     

  • Which version of SQL Server are we talking about because CXPACKET waits either mean nothing at all, or are strong indicators of CPU use.

    Regardless, if I were you, you know you have CPU problems. Focus there, not on waits, but on resource use. Extended event monitoring of individual queries to determine which ones are using the resources, CPU in this case.

    As noted, I wouldn't change the cost threshold unless you determine it will be helpful based on data, not an arbitrary number. Glad my blog post is seeing some use. I wish a lot more people used it.

    ----------------------------------------------------
    The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood...
    Theodore Roosevelt

    The Scary DBA
    Author of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd Edition
    Product Evangelist for Red Gate Software

  • Grant,  Thanks for the reply.  Our Production SQL (prodSQL) is a SQL 2012 instance.

    I agree.  Don't make uninformed decisions.  To that end, I've created an Extended Events session I called "CXPacketWaits".

    I have 6 events in the session:

    1) cpu_threshold_exceeded,

    2) degree_of_parallelism,

    3) perfobject_processor,

    4) query_memory_grant_usage,

    5) wait_info,

    6) wait_info_external.

    The filter for 2) is > 0

    The filter for 5) & 6) is > 1

    I ran the session for 2 hours today (12/9) during peak business hours.  I'm going to run it again at peak hours tomorrow.  I have an RFC open for tomorrow night to change the Cost Threshold for Parallelism from 50 to 500.  I'm going to do two post-implementation EE sessions with the CXPacketWaits template.  I hope to see a smaller degree_of_parallelism value.  What other measures can you recommend for me to monitor?

    Cheers!

    Brandon.Forest@purple.us

    Attachments:
    You must be logged in to view attached files.
  • 2012 supports causality tracking right? I'd add that to the extended event so you can see these things grouped.

    ----------------------------------------------------
    The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood...
    Theodore Roosevelt

    The Scary DBA
    Author of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd Edition
    Product Evangelist for Red Gate Software

Viewing 14 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply