SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 

Comprehensive Baseline Collector Solution (Updated)

By Robert Virag, (first published: 2014/10/21)

UpdateNew version of Baseline Collector Solution (V1.2) is available.

Changes in V1.2 - 2015.05

Support
Please notice: For BCS V1.* no new features are coming, only bugfixes!!!
BugFix 
  • sp_CollectDatabaseInfo - table [database_info] collation_name data type is changed to NVARCHAR(128) to allow NULL values in case of the database is not in ONLINE state or AUTO CLOSE option is ON.
  • sp_CollectTempDBUsage - no comment... just: "shame on me" :(

Thank all of you for the feedbacks!

New features coming in  Baseline Collector Solution V2

  • Easier collector management with Central Configuration
  • More effective, re-worked sp_CollectFileInfo ;)

Changes in V1.1 - 2015.04

  • New collector: sp_CollectInstanceInfo
  • Additional schedules are added
  • Bugfixes

Collecting baseline data is essential. Questions usually come up regarding baselines are the follows:

  • Why is collecting baseline data necessary, useful?
  • What should be collected?
  • How should it be collected?
  • When, how often should it be collected?
  • Where to store the data collected?
  • How long should it be stored?

I won't answer these questions in this article. At least not all of them. There are lots of very good article in this topic which focus on all of these questions. One of the best article series in this topic which I really recommend checking is written by Erin Stellato (5 Reasons You Must Start Capturing Baseline Data)

The main goal of this article is to introduce a free comprehensive solution for collecting baseline.

The WHY

Among the tons of good answers, my favorite is TROUBLESHOOTING. Just imagine (hopefully really just imagine) yourself in the middle of a situation when you need to check an SQL Server Instance which has performance problem. It is much harder to investigate what the real problem is, if you don't have data from the past when everything worked fine. In cases like this a performance related value (almost whatever it is) is just a value with very little message, meaning. It can be good, good enough or bad...

I would like to emphasize that this is just my favorite. I don't like problems, but I like to solve them :)

The Rules

Before I started working out my comprehensive Baseline Collector Solution (BCS), first I had laid down some rules, some principles to follow:

  1. collect only raw data
  2. collect as little data as possible
  3. be as flexible as possible 

Rule No. 1

This solution collects and stores raw data. No calculations, no analysis. Everyone has their own query to analyze the performance related data. You can still use it. The main idea behind this rule is to let everyone do their own calculations without restrictions.

Rule No. 2

Collecting baseline continuously requires lot of storage space. So I tried to collect as little data as possible. Fortunately if you follow Rule No. 1 it is easy since storing only raw data means storing less data compared to calculated data. Consider the following easy example: wait statistics

If you only store raw data you should store the wait_type, waiting_tasks_count, wait_time_ms, max_wait_time_ms, signal_wait_time_ms and the time of the collection of course. In case you would like to store already calculated data you may also would like to store different averages like avg_signal, avg_wait, avg_res. You probably would like to calculate the percentage of the wait types as well. These all mean more and more data to store.

Last but not least if you only store raw data then you have the possibility to do your own calculations on data from different time intervals not just one.

To reduce the necessary space for storing baseline data even more, if data compression is available page compression is enabled on the tables. Fortunately baseline data is a very good candidate for data compression.

Every stored procedure (collector) has a retention parameter so you can easily control the data cleaning.

Rule No. 3

This one caused me lot of headache, lot of thinking. I was faced with the dilemma of choosing between the so called 'Standard' or 'Customizable' solution. I really hope that I managed to find the 'golden mean'. As a result the BCS is usable in long term, short term and in a very short term let's say troubleshooting scenarios.

In the case of those collectors where it is reasonable there is a possibility to run them in troubleshooting mode (TSMode). This only means that instead of keeping collecting the data to the standard baseline tables (see later) and making the data dirty it collects the data into a separate table. In TSMode you can define the destination table, but the destination table schema, design must be appropriate, and it must exist.

To make the TSMode more effective, more flexible you can define how long you want to do the data collection (CollectingInterval), how frequently you want to take a sample (SampleInterval) and in the case of collectors when it is reasonable - for example performance counters - the time interval of the measurement (MeasuringInterval) as well.

The WHAT

The baseline solution consists of several collectors which can collect the following data:

  • Server Configuration Data
  • Database Info
  • Database File Info
  • SQL Server Related Performance Counters
  • Wait Statistics
  • IO Virtual File Statistics
  • TempDB Usage

Solution Schema

There are three types of tables. Standard tables without a prefix are for storing the baseline data.  You can see these in the image below.

Default tables for troubleshooting mode with the same names as the standard tables plus having 'ts_' prefix. These 'ts_*' tables are used when the TSMode is used but the destination table is not defined explicitly.

Tables with 'filter_' prefix are used by the collectors for filtering purpose.

Filter tables

In the case of some collectors you can use filter tables for narrowing down the data collected. There are three types of filter tables. Include, exclude and mixed (exclude-include) filter table.

Filter tables which are used by a collector which has TSMode there is a separate column for filtering in standard mode /is_excluded or is_included/ and for filtering in TSMode /ts_is_excluded or ts_is_included/.

Keep in mind that the filter tables need to be maintained manually. So for example when you create or add a new database you need to add the appropriate info to the [dbo].[filter_database_file] table as well in order to collect information from the new database or to be able to filter it out.

The Collectors

These are the stored procedures that perform the data collection.

Common Parameters:

@help: Display help messages.

@DestTable:In troubleshooting mode it specifies the output table. Table must exist. Can be NULL. 

@DestSchema: In troubleshooting mode it specifies the schema of the output table. Schema must exist. Can be NULL. Default value: dbo

@Retention: Specify the time, in days, after which the data collected are deleted. If no time is specified, then no data are deleted. Default value 90 days.

@CollectingInterval: Duration of the data collection in minutes. Maximum 1440 (1 day), so value has to be between 1 and 1440. It can be NULL. In that case the collector only takes one sample, the actual state of the data collected by the collector.

@SampleInterval: Times in seconds between taking two samples.

@MeasuringInterval: In the case when two measurements are needed it defines the time between the 1st and 2nd measurement of the data sampling for measuring data during the collecting interval.

@TSMode: Enable troubleshooting mode.

@EmptyTSTable: Empty the table used for troubleshooting before the data collection.

sp_CollectConfigData

Collect and store the SQL Server configuration data. It doesn't collect and store the data again and again when you run the collector (Rule Nr. 2). It only stores the data which are changed compared to the data stored or the data which are missing for any reason. For example it was cleaned or deleted manually.

There are three reasons for collecting and storing data:

  • The data does not exist in the baseline table - which is [dbo].[configuration_data] in this case - so there is no base information to compare to. In this situation the data need to be pre-initialized.
  • The current data and the stored one are different. That's why we do the baseline... :)
  • Data is deleted by the inside cleaning process and there is no any other occurrence of the data. In this case this data need to be added back, post-initialized.

To check why the data was collected and stored check the [is_initial] and the [is_pre] columns.

Baseline table: [dbo].[configuration_data]

TSMode is not available.

Example:

EXEC sp_CollectConfigData
@Retention = 90,
@BypassNonActiveSrvConfError = 1,
@help = 0,
@LogInfo = 1

sp_CollectDatabaseInfo

It works in the same way as sp_CollectConfigData.

Be careful though when you upgrade your instance, since this collector is based on the [sys].[databases] system table which is different in the versions of SQL Servers. So in the case of version upgrade you have to upgrade or recreate the [dbo].[database_info] table. The collector recognizes the instance version and collects the data accordingly.

Baseline table: [dbo].[database_info]

Example:

EXEC sp_CollectDatabaseInfo
@Retention = 90,
@help = 0,
@LogInfo = 1

sp_CollectFileInfo

Collect and store database file information including the space used in each file.

Baseline table for standard mode: [dbo].[file_info]

TSMode is available. Default baseline table for troubleshooting mode: [dbo].[ts_file_info]

It comes in handy for example when you would like to keep a close eye on a transaction log file.

Used filter table: [dbo].[filter_database_file]

Type of the filter table: mixed

  • TSMode: include
  • Standard mode: exclude

Related columns:

  • TSMode - [ts_is_included_file_info]
  • Standard mode - [is_excluded_file_info]

If you move one of the database file to different location you don't need to update the filter table since the filtering uses the database name and logical file name.

Example:

EXEC sp_CollectFileInfo
@DestTable = NULL,
@DestSchema = NULL,
@TSMode = 1,
@EmptyTSTable = 0,
@Retention = 90,
@help = 0,
@LogInfo = 1

sp_CollectIOVFStats

Collect and store IO statistcs based on the [sys].[dm_io_virtual_file_stats] dynamic management view.

During the collection, the second measurement of the current sample is considered as the first measurement of the next sample.

Baseline table for standard mode: [dbo].[iovf_stats]

TSMode is available. Default baseline table for troubleshooting mode: [dbo].[ts_iovf_stats]

Used filter table: [dbo].[filter_database_file]

Type of the filter table: mixed

  • TSMode: include
  • Standard mode: exclude

Related columns:

  • TSMode - [ts_is_included_iovf]
  • Standard mode - [is_excluded_iovf]

If you move one of the database file to different location you don't need to update the filter table since the filtering uses the database name and logical file name.

Example:

EXEC sp_CollectIOVFStats
@DestTable = NULL,
@DestSchema  = NULL,
@TSMode = 1,
@EmptyTSTable = 0,
@CollectingInterval  = 1,
@SampleInterval = 10,
@Retention = 90,
@help = 0,
@LogInfo = 1

sp_CollectPerfmonData

Collect and store SQL Server Instance Related Performance Counters based on the [sys].[dm_os_performance_counters] dynamic management view.

@MeasuringInterval unit: second

Parameter value of @SampleInterval must be larger than the parameter value of @MeasuringInterval.

Baseline table for standard mode: [dbo].[perfmon_data]

TSMode is available. Baseline table for troubleshooting mode: [dbo].[ts_perfmon_data]

Used filter table: [dbo].[filter_performance_counters]

Type of the filter table: Include

Related columns: 

  • TSMode - [ts_is_included]
  • Standard mode - [is_included]

Changes related to memory perfmon counters are implemented in the filtering table: Memory Manager surface area changes in SQL Server 2012

Example:

EXEC sp_CollectPerfmonData
@DestTable = NULL,
@DestSchema = NULL,
@TSMode = 1,
@EmptyTSTable = 0,
@CollectingInterval = 1,
@SampleInterval = 20,
@MeasuringInterval = 3,
@Retention = 90,
@help = 0,
@LogInfo = 1

sp_CollectTempDBUsage

Collect and store the TempDB usage based on the [sys].[dm_db_file_space_usage] dynamic management view.

Baseline table for standard mode: [dbo].[tempdb_usage]

TSMode is available. Baseline table for troubleshooting mode: [dbo].[ts_tempdb_usage]

Example:

EXEC sp_CollectTempDBUsage
@DestTable = NULL,
@DestSchema = NULL,
@TSMode = 1,
@EmptyTSTable = 0,
@CollectingInterval = 1,
@SampleInterval = 5,
@Retention = 90,
@help = 0,
@LogInfo = 1

sp_CollectWaitStats

Collect and store wait statistics based on the [sys].[dm_os_wait_stats] dynamic management view.

@ResetWaitStats: It does exacly what its name suggests. Clears the [sys].[dm_os_wait_stats] after the collecting phase.

@MeasuringInterval unit: minute

If @MeasuringInterval is NULL the current state of the waitstats are collected and stored. You can think of it as a 'Point-in-Time measuring'.

So for example you can run the collector in 'Point-in-Time measuring' mode, resetting the waitstats every day at the same time scheduled by the agent. In this case the data collected daily will contain wait statistics for 24 hours.

If you don't want to clear the waitstats or you only want to collect the waitstats for a specific business time interval then just run the collector in 'Interval measuring' mode, and make sure that the interval you specify cover the intended business interval.

You cannot run the collector with @ResetWaitStats enabled in 'Point-in-Time measuring' mode when another instance is running in 'Interval measuring' mode.

If @ResetWaitStats is enabled, it is changed automatically from 1 to 0 in 'Interval measuring' mode.

Baseline table for standard mode: [dbo].[wait_stats]

TSMode is available.  Default baseline table for troubleshooting mode: [dbo].[ts_wait_stats]

Used filter table: [dbo].[filter_wait_types]

Type of the filter table: Exclude

Related columns: 

  • TSMode - [ts_is_excluded]
  • Standard mode - [is_excluded]
The exclude list used is based on Paul Randal's code: Wait statistics, or please tell me where it hurts (2014 marc) 
Example:
EXEC sp_CollectWaitStats
@DestTable = NULL,
@DestSchema = NULL,
@ResetWaitStats = 0,
@MeasuringInterval = 1,
@TSMode = 1,
@EmptyTSTable = 0,
@Retention = 90,
@help = 0,
@LogInfo = 1

Additional remarks

The BaselineCollectorSolution.sql script can create jobs and sample schedules as well that you can use.

Future plans

I am working on a stored procedure for collecting Memory Usage Data. I also plan to write a set of stored procedures for analyzing and evaluating the data collected (Analyzing Part of the Solution).

Please report me every bug you may find so I can fix as soos as I can.

Ideas are always welcome!

Keep checking the updates on SQLApprentice.net

If you have any remarks, questions or you have any ideas, please feel free to contact me, leave a comment or drop me a mail /robertATsqlapprenticeDOTnet/.

 

Resources:

BaselineCollectorSolution_V1.2.zip
Total article views: 11424 | Views in the last 30 days: 13
 
Related Articles
ARTICLE

Baseline Collector Solution V2

New version of the Free Baseline Collector Solution released. Do you collect baseline data? If not, ...

SCRIPT

Collecting SQL Server 2000/2005/2008 baseline

Script to get Server Info Collecting SQL Server 2000/2005/2008 baseline inventory

ARTICLE

How to Install and Configure Baseline Collector Solution V2

In this article I go through the installation and configuration processes of my Baseline Collector S...

ARTICLE

Capturing Baselines on SQL Server: Wait Statistics

By capturing baseline data, a well-prepared DBA should get a good idea of what potential issues they...

ARTICLE

What Counts for a DBA: Baselines

It is very difficult to pinpoint what ails a server, just by looking at a single snapshot of the dat...

Tags
baseline    
 
Contribute