SQLServerCentral Article

SCOME - Part 3: The Failed Jobs Report

,

A recap on things so far

This is article number 3 in this series where we are building a suite of web

reports that will display the statuses of all our SQL Servers. Web reports that

will display Failed jobs, Unrun backups, Test restores, Disk space etc. We will

use T-SQL and ASP.Net to achieve this. In the last article we set up our linked

servers objects to all the SQL Servers on the network that will be monitored.

We also created and populated a table, [Server_SQL_Details], that holds data

regarding these servers.

The Failed Jobs report

We're ready to create our first report, the Failed Jobs report. The articles

will progressively get more complex as we go from article to article, this is

one of the more straightforward reports to setup. The Failed Jobs report will

display all (if any) failed jobs on our 15 gazzillion SQL Servers, and allow

us to further drill down to each server's corresponding log entries (be it job

history errors, maint plan or SQL Log errors), to locate the cause of the failures,

without having to connect to each via Management Studio. We would rather not

have to log on to every server via EM/SSMS to check them, way way too much clicking

involved, not to mention a separate job would need to be created for each remote

server. In this article we will setup the necessary tables and procedures to

do this using the SCOME technique described in article number one.

The Report Fields

The fields returned will be the "Name of the server" where the

job has failed, "Name of the job", the "Job description",

when the "Last attempted run" was and the "Job id". There

will be 3 buttons for each failed job that is returned and they allow us

to extract data about the failed jobs from various SQL Server sources.

There is a text box that allows you to choose which SQL Log (cycle) to choose

in the likely event that the instance has more than one. The date that this

data on the failed jobs was last gathered is also displayed. Once you get the

hang of all this though (and is the point of these articles), you'll be able

to add/remove as many fields as is provided via the DMVs and you will be able

to select ones that you feel appropriate to your way of working.

The Failed Job Table

We first need to create a table that will hold data regarding the failed jobs.

The web front end will read the data from this table. We will create this table

in the database we created in a previous article.

 USE
 [DBA_Admin]
 GO
/******
 Part 3a: The Failed Jobs Report
 Object Creation: Table Failed_Jobs
 Purpose: To create the table Failed_Jobs
 Date: 01/02/2009
 Author: Drew Salem - www.thebuddingdba.com
 For: SQLServerCentral.com
 ******/CREATE TABLE [dbo].[Failed_Jobs]
 (
 [originating_server] [varchar](255) NULL,
 [job_name] [varchar](255) NULL,
 [job_description] [varchar](1023) NULL,
 [last_outcome_message] [varchar](255) NULL,
 [last_run_date] [varchar](63) NULL,
 [job_id] [varchar](255) NULL
 )

SQL Server 2000 vs 2005 +

The SQL Servers that needed monitoring when

I first set this up were versions 2000 and 2005. As many of the system tables

have been replaced by views from 2005 onwards, our monitoring scripts will

need to run code appropriate to the version. That's why when we created our

[Server_SQL_Details] table in an earlier article, we included a column for the

version. We could have just used the ('productversion') argument of SERVERPROPERTY

to obtain this info, but we'll be accessing this info throughout several reports

so why not keep things simple. The script does a quick check against this table

for the version and runs one of two pieces of code dependant on the version

of SQL Server you are collecting data from.

For SQL Server 2000, we will extract data from the

msdb tables [sysjobservers] and [sysjobs].

For 2005, the [sysjobservers],[sysjobs]and

the view [sysoriginatingservers_view].

The SCOME technique requires no geek. It's fairly straightforward. As we have

previously created Linked Server objects for all of our remote SQL Servers,

we can dynamically use T-SQL to gather the data from all remote servers. One

stored procedure, "usp_GetFailedJob", retrieves data about failed

jobs on a single server by excepting this server's instance name as the input

parameter required by this stored procedure. A second proc, "usp_Iterate_FailedJobs"

retrieves the server name of each of our remote SQL Servers from the [Server_SQL_Details]

table and provides it as the arguement required by usp_GetFailedJob, thus firing

usp_GetFailedJob on all servers and populating the results into the Failed_Jobs

table.

The usp_GetFailedJob procedure

  USE [DBA_Admin]
  GO
  /******
    Part 3a: The Failed Jobs Report
    Object Creation: The usp_GetFailedJob stored procedure
    Purpose: To retrieve failed jobs data from a single server
    Date: 08/02/2009
    Author: Drew Salem - www.thebuddingdba.com
    For: SQLServerCentral.com
  ******/  CREATE PROC [dbo].[usp_GetFailedJob] 
    @servername SYSNAME 
  AS 
  SET NOCOUNT ON 
  DECLARE @sql1 VARCHAR (8000)
  DECLARE @version VARCHAR (50)
  --First check which version of SQL Server is running.
    SELECT @version = serverversion 
    FROM Server_SQL_Details
    WHERE servername = @servername
  --If it's 2000 then execute this code
  IF @version = '8'
   BEGIN
    SELECT @sql1 = '
       Select j.originating_server, 
              j.name, j.description, jh.last_outcome_message, substring(space(1), 
              33,33) + 
  -- Calculate and format fail datetime
    -- Add Run Duration Seconds
    cast(
    -- Add Start Time Seconds
    dateadd(ss, cast(substring(cast(last_run_time + 1000000 as char(7)),6,2) as int),
  -- Add Start Time Minutes 
    dateadd(mi, cast(substring(cast(last_run_time + 1000000 as char(7)),4,2) as int),
  -- Add Start Time Hours
    dateadd(hh, cast(substring(cast(last_run_time + 1000000 as char(7)),2,2) as int),
  convert(datetime,cast (last_run_date as char(8)))))) as char(19)) As Last_Run_Date, j.job_id
  FROM [' + @servername + '].msdb.dbo.sysjobservers 
    jh join [' + @servername + '].msdb.dbo.sysjobs j ON jh.job_id=j.job_id
    where last_run_outcome <> 1'
  --Insert into a temp table before it gets its knickers in a twist 
    CREATE TABLE #t2 (
    originating_server VARCHAR(255), 
    job_name VARCHAR(255),
    job_desc VARCHAR(1023),
    last_outcome_message VARCHAR(255), 
    last_run_date VARCHAR(63),
    job_id VARCHAR(255))
  INSERT INTO #t2 EXEC(@sql1)
  --And insert the relevant info into our Failed Jobs table
    INSERT INTO Failed_Jobs
    (originating_Server, job_name, job_description, last_outcome_message, last_run_date, job_id)
    SELECT originating_server, job_name, job_desc, last_outcome_message, last_run_date, job_id
    FROM #t2
  DROP TABLE #t2
  END 
  ELSE
  --If it's 2005 then execute this code
    IF @version = '9'
    BEGIN 
     SELECT @sql1 = '
        Select id.originating_server, 
        j.name, j.description, jh.last_outcome_message, substring(space(1), 
        33,33) + 
        -- Calculate fail datetime
        -- Add Run Duration Seconds
       cast(
       -- Add Start Time Seconds
        dateadd(ss, cast(substring(cast(last_run_time + 1000000 as char(7)),6,2) as int),
       -- Add Start Time Minutes 
        dateadd(mi, cast(substring(cast(last_run_time + 1000000 as char(7)),4,2) as int),
       -- Add Start Time Hours
        dateadd(hh, cast(substring(cast(last_run_time + 1000000 as char(7)),2,2) as int),
        convert(datetime,cast (last_run_date as char(8)))))) as char(19)) As Last_Run_Date, j.job_id
      FROM [' + @servername + '].msdb.dbo.sysjobservers 
       jh join [' + @servername + '].msdb.dbo.sysjobs j ON jh.job_id=j.job_id 
          join [' + @servername + '].msdb.dbo.sysoriginatingservers_view id 
            ON id.originating_server_id=j.originating_server_id
      where last_run_outcome <> 1'
      CREATE TABLE #t3 (
       originating_server VARCHAR(255), 
       job_name VARCHAR (255),
       job_desc VARCHAR (1023),
       last_outcome_message VARCHAR (255), 
       last_run_date VARCHAR(63),
       job_id VARCHAR (255))
      INSERT INTO #t3 EXEC(@sql1)
     INSERT INTO Failed_Jobs 
       (originating_Server, job_name, job_description, last_outcome_message, last_run_date, job_id)
     SELECT originating_server, job_name, job_desc, last_outcome_message, last_run_date, job_id
      FROM #t3
     DROP TABLE #t3
    END

Now if you have setup a linked server, say Payroll_Server,

why don't you deliberately fail any job on it (in the advanced properties of

any existing jobstep you can temporarily set it fail on Success). Then execute

the above proc using Payroll_Server as the input parameter. Details regarding

the failed job will be entered into our Failed_Jobs table.

We don't want to create a job for each linked server though, so now we create

the usp_Iterate_FailedJobs proc that will iterate through each row in the [Server_SQL_Details]

table and use each server name as an input parameter for usp_GetFailedJob.

The usp_Iterate_FailedJobs procedure

    USE [DBA_Admin]
    GO
    /******
    Part 3a: The Failed Jobs Report
    Object Creation: The usp_Iterate_FailedJobs stored procedure
    Purpose: Used to iterate through the list of linked servers in the Server_SQL_Details 
    table and 
    and pass their names as a parameter to the usp_GetFailedJob proc.
    Date: 08/02/2009
    Author: Drew Salem - www.thebuddingdba.com
    For: SQLServerCentral.com
    ******/  
    CREATE PROC [dbo].[usp_Iterate_FailedJobs]
  AS
    DECLARE @ServerName VARCHAR (255)
    DECLARE @ServerID INT
    --I used a memory table, you could also use a cursor
    --We use identity to guarantee that the ServerID will start at 1
    
    DECLARE @MemoryTable TABLE (
    MemoryTableServerID INT IDENTITY (1,1) NOT NULL,
    ServerName VARCHAR (255) NOT NULL)
    INSERT INTO @MemoryTable
    SELECT 
    ServerName 
    FROM Server_SQL_Details
    DECLARE @loopCounter INT
    DECLARE @numberOfRows INT
    --Empty the current contents of the table
    DELETE Failed_Jobs
  
    SET @numberOfRows = ISNULL((SELECT COUNT (*) FROM @MemoryTable),0)
  
    --This is set to 1 so that it matches the FIRST ServerID in @MemoryTable
    SET @loopCounter = 1
    WHILE 
     --Whilst there is at least one entry in the Server_SQL_Details table
    @numberOfRows > 0 
    AND
    -- and we have not processed all the rows
    @loopCounter <= @numberOfRows
    
    BEGIN
    --Select the name of the Server from the MemoryTable, and assign it to the variable @Servername
    SELECT @ServerName = ServerName
    FROM @MemoryTable
    
    --making sure that the number of the row in the MemoryTable is equal 
    -- to the number of the loopCounter
    WHERE MemoryTableServerID = @loopCounter
    --Execute the usp_GetErrorLog stored procedure for each row
    EXEC dbo.usp_GetFailedJob @servername
    --Increment the counter and go back up until the value of the counter is the same
    --as the value of the total number of rows in @MemoryTable
    SET @loopcounter = @loopCounter + 1
    
    END
    

Testing that it works

If you've set up more than one linked server

then now's the time to test your Failed Reports monitoring script! Execute usp_Iterate_FailedJobs

and it will go through each row in the [Server_SQL_Details] table, pick out

the servername, determine the version and extract data regarding the failed

jobs on that server. It will then insert it into the Failed_Jobs table. There

you go, you now know all jobs that have failed on all your SQL Servers! Now

set up a single schedule job on your central server, SQL_Admin, to execute usp_Iterate_FailedJobs.

Lovely Jubley!

In the next article we'll create four more stored

procedures needed for the Failed Jobs report. One to retrieve data regarding

the failed job from the job history table for the server in question, one to

retrieve data regarding the failed job from the maint plan table for the server

in question, one to retrieve data regarding the failed job from the SQL Server

logs for the server in question and one displaying a summary of the Failed_Jobs

table for our web front end.

See you then,

Drew

Resources

Rate

4.67 (12)

You rated this post out of 5. Change rating

Share

Share

Rate

4.67 (12)

You rated this post out of 5. Change rating