Stairway to SQL Server Agent

Stairway to SQL Server Agent - Level 3: Agent Alerts and Operators

,

As we’ve seen in the previous levels of this Stairway, SQL Server Agent jobs are made up of a series of job steps, with each step being a distinct type of work to be performed.  In addition to the work performed in each step, you may want to have messages sent to the DBA (or others) about the success and/or failure of a particular job.   You might also be interested in having messages sent when the system is experiencing trouble, when certain performance issues arise, or when certain thresholds are met on the system.  SQL Server Agent provides the capability of creating alerts, which can cause messages to be sent to designated operators as notifications, to assist you with these tasks.

What is a SQL Server Agent Alert?

A SQL Server Agent alert is an automated response to some condition on your SQL Server system.  An alert may be triggered by the following conditions:

  • A SQL Server system message is generated
  • A system error with a minimum severity level is detected
  • A specified threshold is reached in a SQL Server performance counter
  • A WMI query result is met

Once an alert is triggered, a response, as defined in the alert, will occur.  The possible responses are to either run a particular SQL Server Agent job, or to notify one or more operators. You also have the option of specifying that the error text that triggered the alert be included in any notifications that are sent.

What is a SQL Server Agent Operator?

A SQL Server Agent Operator can be thought of as simply an email address.  There are other options, such as a net send address or a “pager” email name, but in fact the pager options are simply sent as email.

Note:

“Net Send” is still an option for SQL Server Alerts, but this functionality shouldn’t be used in practice and has been officially deprecated from the product, meaning that it will be removed in a future version of SQL Server. The only practical option for alerting is through email notifications.

When an operator is defined, you can then use that operator to deliver emails (or text messages sent via email, for example) to a defined email alias.  That email alias will likely be a group of one or more DBAs who are on standby looking out for system problems.  For critical alerts, you may want to use an escalation operator that may forward text messages to a cell phone, for example.

Creating an operator

To create an operator, open up SQL Server Management Studio (SSMS) and navigate to the Operators folder under SQL Server Agent.  Right click on Operators, and select the option for “New Operator”.  You will see something like Figure 1.

Figure 1 – New Operator Dialog

As you can see in Figure 1, I’ve specified a name, as well as an email address, for the operator.  Ensure that the “Enabled” checkbox is checked, otherwise no alerts to this operator will actually happen.

When you select the Notifications page, you will see a list of the Alerts that are assigned to this operator.  Given that this is a new operator, the list is empty as shown in Figure 2. Click OK, and the operator has been created.

Figure 2 – The Notifications page for a new Operator

On an operational system, one would expect a minimum of two operators to exist – one for urgent system notifications (the Severe_Alerts operator that we just defined, for example), and one for each division of labor you might have on SQL Server Agent jobs and/or ordinary errors.

Operators can, of course, be defined in Transact-SQL as well.  You would use the sp_add_operator system stored procedure to create an operator.  For help with the syntax, see the books online at http://msdn.microsoft.com/en-us/library/ms186747.aspx.

Note

A common question about operators is how you can send alerts to multiple people.  The way you do this is to use mail distribution lists through your email system, as SQL Server Agent will only send alerts to a single operator.  You would therefore define the operator in SQL Server Agent as a distribution list, and everyone on the list will then receive the associated alerts.

Fail-Safe Operators

A fail-safe operator is one that is defined to be used if all other operator notifications fail.  There are several possible reasons for such failure (for example, a failure in your mail system, which we’ll look at in the next article), but in any event having a fail-safe operator is a good idea.  You define the fail-safe operator once you have created one or more Operators.  The fail-safe operator is simply an operator that you specify that will be contacted as a last resort.

You define a fail-safe operator on the SQL Server Agent properties dialog (right-click on SQL Server Agent, select Properties), on the Alert System page.  Check the box to enable the fail-safe operator, and then select from the list of operators you have defined on your system as shown in Figure 3.  Click OK when you have finished your selection.  Note that the rest of the options on this page will be covered in my next article on database mail.

Figure 3: Configuring a Fail-safe operator

Creating Alerts

So now we’re ready to actually create the alert, which as we saw earlier, is simply an automated response to some condition on your SQL Server system.  To define a new alert, navigate to the SQL Server Agent folder in SSMS, then to the Alertsfolder, right-click and select “New Alert…” to launch the new Alert dialog.  You will see a dialog like Figure 4.  For the first alert on the system, we will create an alert to notify the DBA of any system-level error (severity 19 or higher).

Figure 4: Creating a new alert

SQL Server Event Alert

Just like operators, the alert must be enabled to actually have anything happen with the alert.  Note that in this example, the “Type” of alert is a “SQL Server event alert”.  You can also select from “SQL Server performance condition alert”, which provides visibility to the SQL Server-specific performance counters for the instance of SQL Server associated with this instance. We’ll see a performance condition alert in the next section.  Finally, there is a “WMI event alert” that will allow you to write Windows Management Instrumentation (WMI) queries and respond to them with alerts.  You can learn more about WMI at http://msdn.microsoft.com/en-us/library/aa394582(v=VS.85).aspx.

The severity values shown in the dropdown in the New Alert dialog is simply a list of all the possible error severity levels available to SQL Server (see the RAISERROR command documentation for more details at http://msdn.microsoft.com/en-us/library/ms178592.aspx. ) Those errors at level 19 or above are severe errors that may require DBA intervention to investigate why a severe error has occurred.

Once you have selected the options as shown in Figure 4, click on the Response page to look at what actions can be taken for this alert (as shown in Figure 5).  You will see that you can execute a SQL Server Agent job, or notify an operator (for example, the operator we defined above).  In this example, we’ve selected to send an email to our Severe_Alerts operator when this alert fires.

Figure 5:  New Alert Responses

After checking the preferred operator, you can switch to the Options page for a few interesting choices to make about this alert , as shown in Figure 6.  Select the option to include the alert text if you’d like to know why the alert is being sent to you (otherwise understanding the alert will likely be more difficult).  This page also allows you to specify an additional notification message which can be useful  to remind you at 3am why you’re getting woken up with this message, or to help as a search string later on when digging into the emails that came from SQL Server.  If this is a common alert condition, you may choose to use the default notification message exclusively without additional comments of your own.

The delay options specify whether you should get a notification every single time this alert is triggered, or if one every “x” number of minutes or seconds is good enough.  In this example, a 5 minute delay is set, so that your email inbox won’t be flooded with a series of messages, all indicating the same problem.

Figure 6 New Alert Options

Click OK when you’ve completed your choices for this dialog and you’ve created a new alert.

You can, of course, create alerts via the sp_add_alertsystem stored procedure.  The documentation for the stored proc is available at http://msdn.microsoft.com/en-us/library/ms189531.aspx, or simply script out the one you just created from SSMS, as illustrated in  Figure 7 .

Figure 7:  Scripting an Alert

Alert exemptions

It’s interesting to note that you can specify that certain error messages should never fire an alert. To define such errors, you must manually update the registry on the server machine. The registry location is in the instance-specific registry for your instance. Under the SQL Server Agent registry folder, you’ll find a key named “NonAlertableErrors”, as shown in Figure 8. By default, two error numbers will be there: 1204 (Can’t obtain a lock) and 4002 (login failed).  If you really do want an alert on either of these error conditions, you will need to edit this registry key to remove the number from the list.  Conversely, if you want to add additional error numbers that should never fire alerts, you can add those error numbers to this list.

Of course, all the usual warnings about being extra-careful when editing your registry apply.

Figure 8: Registry Editor to configure NonAlertableErrors

Performance Condition Alerts

The second type of alert is a Performance Condition alert.    You can set up many, many kinds of alerts based on performance counters.  Unfortunately, you are restricted to just those counters that are exposed by SQL Server, but that still allows you to add quite a bit of automatic monitoring to your systems.  For example, in the screenshot below (Figure 9), you can see an alert that will warn you when the transaction log for the AdventureWorks database is more than 90% full.  Once you have selected the options as shown in Figure 4, click on the Response page to look at what actions can be taken for this alert (as shown in Figure 5).  You will see that you can execute a SQL Server Agent job (such as the earlier reference to growing the log space, or starting a backup), or notify an operator (for example, the operator we defined above).  In this example, we’ve selected to send an email to our Severe_Alerts operator when this alert fires.

Figure 9:  Performance Condition Alert

Beyond the Alert system basics

In addition to the high severity-level alert that was created in this article, you should investigate other aspects of alerts.  Setting up alerts on performance counters and WMI queries are extremely powerful capabilities.  You are limited, however to just the SQL Server-specific performance counters, and not system counters such as CPU.  With that noted, however, you can still set up alerts when SQL Server logins reach a certain threshold, or upon certain SQL Server memory conditions, or when I/O thresholds are exceeded.

Additionally, the ability to execute a job when an alert occurs means that you can fix many common error conditions through automation rather than having human intervention every time a common error condition occurs.  There are no specific requirements for the job that is executed; any SQL Server Agent job may be called.

What’s Next

At least one operator with basic alerting should be configured on all SQL Server systems.  This will ensure that for the most severe errors, some sort of notification is sent to your DBA team.  Of course, having email alerts depends on having SQL Server able to send email alerts, so our next article will focus on how to configure Database Mail.

This article is part of the parent stairway Stairway to SQL Server Agent

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating