SQLServerCentral Article

Nightly Failed Jobs Report

,

Do you get tired of reviewing each SQL Agent notifications nightly to determine which SQL Agent jobs failed? Is there a significant number of SQL Agent job notifications that it takes a while to review each? Here is an alternative that allows you to have a single email report of all job failures.

Why We Created A Single Job Failure Report

Over the past few years we have implemented a number of SQL Agent jobs in SQL Server. Each job uses the SQL Agent notification process to communicate success or failures to the DBAs. This communication is all done via email. Due to the number of jobs we have, we get lots of emails reporting the successes and failures of our nightly jobs. Every morning the time to review each of the email notifications to determine success or failure of each SQL Agent jobs was time consuming. Also it was easy to get in a habit of quickly deleting the emails without really noticing whether the jobs succeeded or failed. For this reason we decide to create a process to produce a single job failure report.

What We Did

We decide since the job/step failure information was stored in the MSDB database that we would write a stored procedure called “usp_failed_jobs_report (see “Listing of SP” below) to produce a single report. This stored procedure performs some simple SQL to gather information from the sysjobhistory and sysjobs tables, formats the data into a report, and then emails the report to the our DBAs.

This SP code is run every morning Monday - Friday at 6 am via a SQL Agent job. The SP determines which jobs and job steps have failed since the last scheduled running of this SP. For each failed jobs/job step this SP produces a single line in the report. The line in the report shows the name of the job and step that failed, plus the date and time of the failure.

Example of Report

Here is an example on the a report produced by this SP.

The following jobs/steps failed since Aug 23 2002  6:00AM                                            
job                                         step_name                         failed datetime       
------------------------------------------- --------------------------------- -------------------   
ADRN1303 TRIGGER                            TRIGGER FILE EXIST                Aug 25 2002 12:15AM   
LOAD EMPLOYEE TABLE                         LOAD TABLE                        Aug 25 2002 12:58AM
LOAS EMPLOYEE TABLE                         (Job outcome)                     Aug 25 2002 12:58AM     
ADDS load for PROD1 New                     FTP VALIDATION 1                  Aug 25 2002  2:16AM    
ADDS load for PROD1 New                     FTP VALIDATION 2                  Aug 25 2002  2:16AM 

This report shows that “ADRN1303 TRIGGER” job had a step failure, “LOAD EMPLOYEE TABLE” had a step and job failure (the “(job outcome)” step_name indicates a job failure), and “ADDS load for PROD1 New” had two different step failures.

Conclusion

By having a single report, the DBA's in our shop can now review a single email to determine which jobs have failed since the last business day. Having this single report allows a quick method for the DBA’s to identify all the jobs and step failures for the past day. By reviewing a single report the DBA’s don’t have to review each individual SQL Agent notification email to determine which jobs failed.

Listing of SP

CREATE procedure usp_failed_jobs_report as
-- Written by: Greg Larsen
-- Company: Department of Health, Washington State
-- Date: January 3, 2002
-- Description:  This SQL Code reports job/step failures based on a data and time range.  The
--               report built is emailed to the DBA distribution list.
--
-- Modified 04/12/2002 - Greg Larsen - Modified to support Long running jobs that cross reporting
--                                     periods
declare @RPT_BEGIN_DATE datetime
declare @NUMBER_OF_DAYS int
-- Set the number of days to go back to calculate the report begin date
set @NUMBER_OF_DAYS = -1
-- If the current date is Monday, then have the report start on Friday.
if datepart(dw,getdate()) = 2
  set @NUMBER_OF_DAYS = -3
-- Get the report begin date and time
set @RPT_BEGIN_DATE = dateadd(day,@NUMBER_OF_DAYS,getdate())
-- Get todays date in YYMMDD format
-- Create temporary table to hold report
create table ##temp_text (
email_text char(100))
-- Generate report heading and column headers
insert into ##temp_text values('The following jobs/steps failed since ' +
                               cast(@RPT_BEGIN_DATE as char(20)) )
insert into ##temp_text values ('job                                         step_name                         failed datetime    ')
insert into ##temp_text values ('------------------------------------------- --------------------------------- -------------------')
-- Generate report detail for failed jobs/steps
insert into ##temp_text (email_text)
 select substring(j.name,1,43)+
        substring('                                           ',
        len(j.name),43) + substring(jh.step_name,1,33) +
        substring('                                 ',
        len(jh.step_name),33) +
        -- Calculate fail datetime
        -- Add Run Duration Seconds
        cast(dateadd(ss,
        cast(substring(cast(run_duration + 1000000 as char(7)),6,2) as int),
        -- Add Run Duration Minutes
        dateadd(mi,
        cast(substring(cast(run_duration + 1000000 as char(7)),4,2) as int),
        -- Add Run Duration Hours
        dateadd(hh,
        cast(substring(cast(run_duration + 1000000 as char(7)),2,2) as int),
        -- Add Start Time Seconds
        dateadd(ss,
        cast(substring(cast(run_time + 1000000 as char(7)),6,2) as int),
        -- Add Start Time Minutes
        dateadd(mi,
        cast(substring(cast(run_time + 1000000 as char(7)),4,2) as int),
        -- Add Start Time Hours
        dateadd(hh,
        cast(substring(cast(run_time + 1000000 as char(7)),2,2) as int),
        convert(datetime,cast (run_date as char(8))))
           ))))) as char(19))
   from msdb..sysjobhistory jh join msdb..sysjobs j on jh.job_id=j.job_id
   where   (getdate() >
               -- Calculate fail datetime
               -- Add Run Duration Seconds
               dateadd(ss,
               cast(substring(cast(run_duration + 1000000 as char(7)),6,2) as int),
               -- Add Run Duration Minutes
               dateadd(mi,
               cast(substring(cast(run_duration + 1000000 as char(7)),4,2) as int),
               -- Add Run Duration Hours
               dateadd(hh,
               cast(substring(cast(run_duration + 1000000 as char(7)),2,2) as int),
               -- Add Start Time Seconds
               dateadd(ss,
               cast(substring(cast(run_time + 1000000 as char(7)),6,2) as int),
               -- Add Start Time Minutes
               dateadd(mi,
               cast(substring(cast(run_time + 1000000 as char(7)),4,2) as int),
               -- Add Start Time Hours
               dateadd(hh,
               cast(substring(cast(run_time + 1000000 as char(7)),2,2) as int),
               convert(datetime,cast (run_date as char(8))))
               ))))))
and  (@RPT_BEGIN_DATE < -- Calculate fail datetime
               -- Add Run Duration Seconds
               dateadd(ss,
               cast(substring(cast(run_duration + 1000000 as char(7)),6,2) as int),
               -- Add Run Duration Minutes
               dateadd(mi,
               cast(substring(cast(run_duration + 1000000 as char(7)),4,2) as int),
               -- Add Run Duration Hours
               dateadd(hh,
               cast(substring(cast(run_duration + 1000000 as char(7)),2,2) as int),
               -- Add Start Time Seconds
               dateadd(ss,
               cast(substring(cast(run_time + 1000000 as char(7)),6,2) as int),
               -- Add Start Time Minutes
               dateadd(mi,
               cast(substring(cast(run_time + 1000000 as char(7)),4,2) as int),
               -- Add Start Time Hours
               dateadd(hh,
               cast(substring(cast(run_time + 1000000 as char(7)),2,2) as int),
               convert(datetime,cast (run_date as char(8))))
               ))))))
      
      and jh.run_status = 0
-- Email report to DBA distribution list
exec master.dbo.xp_sendmail @recipients='Greg.Larsen@sqlservercentral.com',
              @subject='Check for Failed Jobs - Contains jobs/steps that have failed.',
              @query='select * from ##temp_text' , @no_header='true', @width=150
-- Drop temporary table
drop table ##temp_text
 
GO
SET QUOTED_IDENTIFIER OFF
GO
SET ANSI_NULLS ON
GO

Rate

5 (4)

You rated this post out of 5. Change rating

Share

Share

Rate

5 (4)

You rated this post out of 5. Change rating