A recap on things so far
This is article number 3 in this series where we are building a suite of web
reports that will display the statuses of all our SQL Servers. Web reports that
will display Failed jobs, Unrun backups, Test restores, Disk space etc. We will
use T-SQL and ASP.Net to achieve this. In the last article we set up our linked
servers objects to all the SQL Servers on the network that will be monitored.
We also created and populated a table, [Server_SQL_Details], that holds data
regarding these servers.
The Failed Jobs report
We're ready to create our first report, the Failed Jobs report. The articles
will progressively get more complex as we go from article to article, this is
one of the more straightforward reports to setup. The Failed Jobs report will
display all (if any) failed jobs on our 15 gazzillion SQL Servers, and allow
us to further drill down to each server's corresponding log entries (be it job
history errors, maint plan or SQL Log errors), to locate the cause of the failures,
without having to connect to each via Management Studio. We would rather not
have to log on to every server via EM/SSMS to check them, way way too much clicking
involved, not to mention a separate job would need to be created for each remote
server. In this article we will setup the necessary tables and procedures to
do this using the SCOME technique described in article number one.
The Report Fields
The fields returned will be the "Name of the server" where the
job has failed, "Name of the job", the "Job description",
when the "Last attempted run" was and the "Job id". There
will be 3 buttons for each failed job that is returned and they allow us
to extract data about the failed jobs from various SQL Server sources.
There is a text box that allows you to choose which SQL Log (cycle) to choose
in the likely event that the instance has more than one. The date that this
data on the failed jobs was last gathered is also displayed. Once you get the
hang of all this though (and is the point of these articles), you'll be able
to add/remove as many fields as is provided via the DMVs and you will be able
to select ones that you feel appropriate to your way of working.
The Failed Job Table
We first need to create a table that will hold data regarding the failed jobs.
The web front end will read the data from this table. We will create this table
in the database we created in a previous article.
USE [DBA_Admin] GO /****** Part 3a: The Failed Jobs Report Object Creation: Table Failed_Jobs Purpose: To create the table Failed_Jobs Date: 01/02/2009 Author: Drew Salem - www.thebuddingdba.com For: SQLServerCentral.com ******/CREATE TABLE [dbo].[Failed_Jobs] ( [originating_server] [varchar](255) NULL, [job_name] [varchar](255) NULL, [job_description] [varchar](1023) NULL, [last_outcome_message] [varchar](255) NULL, [last_run_date] [varchar](63) NULL, [job_id] [varchar](255) NULL )
SQL Server 2000 vs 2005 +
The SQL Servers that needed monitoring when
I first set this up were versions 2000 and 2005. As many of the system tables
have been replaced by views from 2005 onwards, our monitoring scripts will
need to run code appropriate to the version. That's why when we created our
[Server_SQL_Details] table in an earlier article, we included a column for the
version. We could have just used the ('productversion') argument of SERVERPROPERTY
to obtain this info, but we'll be accessing this info throughout several reports
so why not keep things simple. The script does a quick check against this table
for the version and runs one of two pieces of code dependant on the version
of SQL Server you are collecting data from.
For SQL Server 2000, we will extract data from the
msdb tables [sysjobservers] and [sysjobs].
For 2005, the [sysjobservers],[sysjobs]and
the view [sysoriginatingservers_view].
The SCOME technique requires no geek. It's fairly straightforward. As we have
previously created Linked Server objects for all of our remote SQL Servers,
we can dynamically use T-SQL to gather the data from all remote servers. One
stored procedure, "usp_GetFailedJob", retrieves data about failed
jobs on a single server by excepting this server's instance name as the input
parameter required by this stored procedure. A second proc, "usp_Iterate_FailedJobs"
retrieves the server name of each of our remote SQL Servers from the [Server_SQL_Details]
table and provides it as the arguement required by usp_GetFailedJob, thus firing
usp_GetFailedJob on all servers and populating the results into the Failed_Jobs
table.
The usp_GetFailedJob procedure
USE [DBA_Admin] GO /****** Part 3a: The Failed Jobs Report Object Creation: The usp_GetFailedJob stored procedure Purpose: To retrieve failed jobs data from a single server Date: 08/02/2009 Author: Drew Salem - www.thebuddingdba.com For: SQLServerCentral.com ******/ CREATE PROC [dbo].[usp_GetFailedJob] @servername SYSNAME AS SET NOCOUNT ON DECLARE @sql1 VARCHAR (8000) DECLARE @version VARCHAR (50) --First check which version of SQL Server is running. SELECT @version = serverversion FROM Server_SQL_Details WHERE servername = @servername --If it's 2000 then execute this code IF @version = '8' BEGIN SELECT @sql1 = ' Select j.originating_server, j.name, j.description, jh.last_outcome_message, substring(space(1), 33,33) + -- Calculate and format fail datetime -- Add Run Duration Seconds cast( -- Add Start Time Seconds dateadd(ss, cast(substring(cast(last_run_time + 1000000 as char(7)),6,2) as int), -- Add Start Time Minutes dateadd(mi, cast(substring(cast(last_run_time + 1000000 as char(7)),4,2) as int), -- Add Start Time Hours dateadd(hh, cast(substring(cast(last_run_time + 1000000 as char(7)),2,2) as int), convert(datetime,cast (last_run_date as char(8)))))) as char(19)) As Last_Run_Date, j.job_id FROM [' + @servername + '].msdb.dbo.sysjobservers jh join [' + @servername + '].msdb.dbo.sysjobs j ON jh.job_id=j.job_id where last_run_outcome <> 1' --Insert into a temp table before it gets its knickers in a twist CREATE TABLE #t2 ( originating_server VARCHAR(255), job_name VARCHAR(255), job_desc VARCHAR(1023), last_outcome_message VARCHAR(255), last_run_date VARCHAR(63), job_id VARCHAR(255)) INSERT INTO #t2 EXEC(@sql1) --And insert the relevant info into our Failed Jobs table INSERT INTO Failed_Jobs (originating_Server, job_name, job_description, last_outcome_message, last_run_date, job_id) SELECT originating_server, job_name, job_desc, last_outcome_message, last_run_date, job_id FROM #t2 DROP TABLE #t2 END ELSE --If it's 2005 then execute this code IF @version = '9' BEGIN SELECT @sql1 = ' Select id.originating_server, j.name, j.description, jh.last_outcome_message, substring(space(1), 33,33) + -- Calculate fail datetime -- Add Run Duration Seconds cast( -- Add Start Time Seconds dateadd(ss, cast(substring(cast(last_run_time + 1000000 as char(7)),6,2) as int), -- Add Start Time Minutes dateadd(mi, cast(substring(cast(last_run_time + 1000000 as char(7)),4,2) as int), -- Add Start Time Hours dateadd(hh, cast(substring(cast(last_run_time + 1000000 as char(7)),2,2) as int), convert(datetime,cast (last_run_date as char(8)))))) as char(19)) As Last_Run_Date, j.job_id FROM [' + @servername + '].msdb.dbo.sysjobservers jh join [' + @servername + '].msdb.dbo.sysjobs j ON jh.job_id=j.job_id join [' + @servername + '].msdb.dbo.sysoriginatingservers_view id ON id.originating_server_id=j.originating_server_id where last_run_outcome <> 1' CREATE TABLE #t3 ( originating_server VARCHAR(255), job_name VARCHAR (255), job_desc VARCHAR (1023), last_outcome_message VARCHAR (255), last_run_date VARCHAR(63), job_id VARCHAR (255)) INSERT INTO #t3 EXEC(@sql1) INSERT INTO Failed_Jobs (originating_Server, job_name, job_description, last_outcome_message, last_run_date, job_id) SELECT originating_server, job_name, job_desc, last_outcome_message, last_run_date, job_id FROM #t3 DROP TABLE #t3 END
Now if you have setup a linked server, say Payroll_Server,
why don't you deliberately fail any job on it (in the advanced properties of
any existing jobstep you can temporarily set it fail on Success). Then execute
the above proc using Payroll_Server as the input parameter. Details regarding
the failed job will be entered into our Failed_Jobs table.
We don't want to create a job for each linked server though, so now we create
the usp_Iterate_FailedJobs proc that will iterate through each row in the [Server_SQL_Details]
table and use each server name as an input parameter for usp_GetFailedJob.
The usp_Iterate_FailedJobs procedure
USE [DBA_Admin] GO /****** Part 3a: The Failed Jobs Report Object Creation: The usp_Iterate_FailedJobs stored procedure Purpose: Used to iterate through the list of linked servers in the Server_SQL_Details table and and pass their names as a parameter to the usp_GetFailedJob proc. Date: 08/02/2009 Author: Drew Salem - www.thebuddingdba.com For: SQLServerCentral.com ******/ CREATE PROC [dbo].[usp_Iterate_FailedJobs] AS DECLARE @ServerName VARCHAR (255) DECLARE @ServerID INT --I used a memory table, you could also use a cursor --We use identity to guarantee that the ServerID will start at 1 DECLARE @MemoryTable TABLE ( MemoryTableServerID INT IDENTITY (1,1) NOT NULL, ServerName VARCHAR (255) NOT NULL) INSERT INTO @MemoryTable SELECT ServerName FROM Server_SQL_Details DECLARE @loopCounter INT DECLARE @numberOfRows INT --Empty the current contents of the table DELETE Failed_Jobs SET @numberOfRows = ISNULL((SELECT COUNT (*) FROM @MemoryTable),0) --This is set to 1 so that it matches the FIRST ServerID in @MemoryTable SET @loopCounter = 1 WHILE --Whilst there is at least one entry in the Server_SQL_Details table @numberOfRows > 0 AND -- and we have not processed all the rows @loopCounter <= @numberOfRows BEGIN --Select the name of the Server from the MemoryTable, and assign it to the variable @Servername SELECT @ServerName = ServerName FROM @MemoryTable --making sure that the number of the row in the MemoryTable is equal -- to the number of the loopCounter WHERE MemoryTableServerID = @loopCounter --Execute the usp_GetErrorLog stored procedure for each row EXEC dbo.usp_GetFailedJob @servername --Increment the counter and go back up until the value of the counter is the same --as the value of the total number of rows in @MemoryTable SET @loopcounter = @loopCounter + 1 END
Testing that it works
If you've set up more than one linked server
then now's the time to test your Failed Reports monitoring script! Execute usp_Iterate_FailedJobs
and it will go through each row in the [Server_SQL_Details] table, pick out
the servername, determine the version and extract data regarding the failed
jobs on that server. It will then insert it into the Failed_Jobs table. There
you go, you now know all jobs that have failed on all your SQL Servers! Now
set up a single schedule job on your central server, SQL_Admin, to execute usp_Iterate_FailedJobs.
Lovely Jubley!
In the next article we'll create four more stored
procedures needed for the Failed Jobs report. One to retrieve data regarding
the failed job from the job history table for the server in question, one to
retrieve data regarding the failed job from the maint plan table for the server
in question, one to retrieve data regarding the failed job from the SQL Server
logs for the server in question and one displaying a summary of the Failed_Jobs
table for our web front end.
See you then,
Drew