There is one thing we all have in common, and a lot of us probably don’t even realize it, baselines!
While this is a blog about SQL Server, baseline monitoring isn’t something only DBA’s do. Actually everyone uses baselines in one way or the other. For instance, when you receive your salary you expect it to be a certain amount, if you receive less (or more) then the expected amount you will probably take action and call your boss. This example shows us our baseline value, our ‘normal’ salary, and a change (more or less salary) that impacted our baseline.
While we use baselines for just about everything in our lives I still notice large groups of IT personal either not using them, or using them completely wrong. Baselines are incredibly powerful measurements that are relatively simple to implement and understand, so why isn’t everyone using them?
In this article I’ll discuss how baselines work and how you can apply them to your work. I won’t be discussing how to capture them, I already wrote an article on capturing wait statistic baselines here, this article is more a theoretical approach to baselines.
What are baselines?
A simple way of defining a baseline is the “standard” or the “normal” situation. This basically means the situation in which everything is running normally without any strange exceptions. If we use our example of your salary again, the “standard” situation would be if you receive the correct salary on the agreed interval (every week, month etc.). Next to being the “standard” or “normal” situation a baseline can also be defined as the “starting point” of our measurements.
In most cases we will compare subsequent measurements against our baseline to see if a change has a positive or negative impact on the subject we are measuring.
Why use a baseline?
As you read in the previous chapter a baseline is a measure tool to monitor the impact of changes. As such it will help you determine if your (planned or unplanned) changes have any impact on the thing you are measuring.
An example from our own DBA field of work:
Say we have a database system that is being used by 500 users on a daily basis. Users never have any problems using the database and don’t run into performance related problems. One day a developer changes a table, adds some indexes and updates some views. These changes happen through an update script the developer ran. A few days after the update the users, who normally are happy, start to complain about performance issues.
In this example we defined the baseline as 500 happy users who don’t run into performance issues while working with our database. If we compare the new situation against our baseline we can agree the situation has changed, users run into performance issues and are unhappy. Since we have our baseline of the “normal” situation we can conclude the update has a negative impact on our baseline and maybe the update should be reverted.
If we didn’t have our initial baseline we wouldn’t know users where happy before the update and we could assume the update is working fine. Even if a user complains the performance is worse then before the update we wouldn’t have a facts to check that the user is right since we don’t have our baseline.
This is a pretty abstract example, normally we would be gathering baseline data of all types of important SQL Server performance counters (or at least I hope you do) so we can identify the exact cause of why the performance has decreased after the update.
What is a useful baseline measurement?
If you want to use a baseline you need to make sure it is meaningful and useful to measure. Make sure you understand the baseline values you are monitoring! Not understanding what you are actually measuring can result in wrong conclusions.
While we can just baseline everything that is going on, you should ask yourself the question if it is really needed to capture all these measurements. Gathering too much information can blind you in cases when you need to get to the bottom of a problem fast. Focus on baselines that can give you an answer fast and reliable!
You should also create your baselines for multiple periods of time. For instance, instead of just building a baseline by using an average of a complete week it can be interesting to exclude the weekend because there are no users in the weekends. If you would compare a measurement you did on Monday to that average baseline you could see the measurement is a lot higher then your baseline because the average baseline includes the weekend when nobody is working!
Comparing to your baseline
There are a variety of different types of baselines you can build from your data. Some of them are very easy to compare to while others are more difficult.
For these baseline examples I’ll use a SQL Server performance counter.
A stable baseline is one of the easiest baselines to compare your data with.
The image bellow shows an example of a stable baseline value of transactions/sec:
Even though the values we captured fluctuate slightly we can see a pretty straight line in our graph. If we need to compare measurements against the baseline we can quickly identify if there is an increase or decrease. To illustrate this take a look at the image bellow:
As you can see the orange line we are comparing to our baseline has a higher value then the baseline. Because our baseline is pretty stable we can immediately notice this in the graph.
While a stable baseline is very easy to compare your measurements against, in reality you will most likely see a variable baseline as illustrated by the image bellow:
When comparing data against a baseline like this you usually end up seeing the spikes on other measuring points like the image below:
To find out if there is a problem when comparing it to the baseline you should try and find trends in the measure data if possible. For instance: the spike at capture point 6 returns in the baseline but also in the measurement we are comparing, is this spike always returning on point 6 or is this just a coincidence? Analyzing those trends will help you understand if your attention is actually needed or if it is just “normal” behavior.
Another way to detect if your attention is needed is by setting high and low values together with your baseline. For instance: If you know your users experience problems when the Transaction/sec measurement reaches 115 you should set your high value as 115. When a measurements exceeds this value you can quickly identify it.
As you can see the spike at data point 6 exceeds the high value and could be reason to investigate.
Hopefully this article gave you a little more insight into baselines and how you can use them in your work. Baselines can be very complex and a great understanding of the environment you are building your baseline in is needed to make it meaningful.
Try to start building your baselines when you start, and build new ones after changes. This way you can always compare the impact of changes to your previous baselines.