I have mentioned predicate order previously and wanted to make sure I drive home the importance of the predicate order when dealing with Extended Events. This is an important topic because it can be very helpful in maintaining a reasonably performing event monitoring system.
The proper order of a predicate will be the difference between enabling the Event Engine to short circuit an event or whether it will cause it to labor needlessly and add to the monitoring overhead.
To illustrate this more clearly, I want to first evaluate how a water filtration system works. I will try to do the image justice, and bear with me for a bit.
Within a water filtration system, there are several filters in place. Each filter helps to remove a different set of particulates. Each progressive filter is designed to remove subsequently smaller particulates.
So, in the design of water filtration, at the top level, the filter is nearly wide open allowing all but the largest particulates to flow throw to the next level. This is the method employed at each level of the filtering process until at the end we have nothing but water.
If one were to employ that design in an event monitoring system, this would equate to the funnel design where the largest quantity of events possible are allowed through the first filter. This is a close assimilation of what we see with Profiler. Let everything in to the filtering system and then eliminate the least matched items first until the criteria most likely to fail (or most unique) criteria are finally filtered.
That sort of methodology works well for a water filtering system because it really needs to filter out the larger items first as a matter of efficiency. Within a monitoring system, it is
more useful to eliminate the finer grained events first. Let me try to explain while using this next image.
In this type of approach, the largest items are excluded first. This isn’t large as in size since we are talking about streams of data. But rather we try to first eliminate the items that are least likely to be part of the desired result. Then the filtering becomes less restrictive as we flow down through the system.
By eliminating the larger amounts of noise first, this means less items flow through the filter to the next level of the filtering system. This is the method employed by Extended Events. I will explain in more detail in just a moment.
First, let’s examine briefly the processing steps within Extended Events.
Notice, from this layout, that the predicate is evaluated third. This evaluation occurs prior to the application of the action data and prior to allowing the data to hit the targets. If the predicate were to be evaluated after actions, one could imagine that the overhead would become a little bit higher. And should predicates be evaluated after writing to the targets, the overhead would again be higher.
That’s a good optimization for the event engine. Short-circuiting of events helps to further reduce the load in that third step because it allows even fewer events down the filter process before they can even get on to that next step in the overall process.
If I have a server with 500 databases and I am interested in finding all of the queries that happen to be taking more than 25 seconds to complete for my ClientOrders database, I could try to find all of those through Extended Events through a couple of means. First, with very little attention to the order of the predicate:
/* ClientOrder database id = 113 */ ADD EVENT sp_statement_completed (WHERE duration > 25000000 AND source_database_id = 113)
Given that I have 500 databases on this instance, I could have a pretty large set of events that would now need to flow through to the next predicate. Let’s just assume that I am managing an extremely poorly performing Instance and all databases have frequent occurrences of queries taking more than 25 seconds. Since only one Business Unit really cares about how poorly their application performs, I really need to find their queries that are performing terribly – and I must do so with minimal impact to the server.
Now, if I write the predicate as follows:
/* ClientOrder database id = 113 */ ADD EVENT sp_statement_completed ( ACTION ( package0.callstack, sqlserver.client_app_name) WHERE (source_database_id = 113 AND duration > 25000000) );
I have decreased the flow of events through the filtering process that would reach the second filter phase. This can lead to a more lower impact on the instance that may already be struggling given the problems I have already described.
How have I decreased the flow of events through the filtering process? This is the short-circuit in action. More accurately, Extended Events will evaluate the order of the predicate in the order that it is declared. By putting the most restrictive (or most likely to fail) predicate first in my predicate declaration, it will be evaluated first and subsequently short-circuit the event evaluation process whenever my database id does not match.
With all databases having performance issues and likely to have queries that exceed 25 seconds, my best option to short-circuit was to single out the database first since it is only 1 of 500 databases on the instance. This short-circuiting is sure to be a departure from expectation for many database professionals that are accustomed to the Query Optimizer re-ordering the predicates of queries automatically.
In the second example, I also showed the addition of an action to the event. I did this to underscore the processing steps. With predicates being evaluated prior to action data being attached, this means that action data does not get attached to unwanted events and subsequently causing performance issues due to that. This is a second degree of short-circuiting.
A good predicate in Extended Events will be of great use in your troubleshooting and monitoring efforts.
Stay tuned for more on Extended Events. If you have missed any articles in this series, please check out the Table of Contents for the series.