Managing Jobs - Part 2

  • We had a situation where a job failed for a long time without anyone realizing because it had been set up to report success even when certain individual steps failed. I've been frustrated by the limitations of the built-in failure notification mechanism in this regard, so I decided to put an update trigger on sysjobsteps to catch every time a job step completes abnormally. The trigger executes xp_smtp_sendmail, which is a well-known 3rd party SMTP xproc. xp_smtp_sendmail has good support for attachments, so I set up the trigger to attach the job log file to the notification message that is sent.

    Putting triggers on system tables is almost always a VERY BAD IDEA, but it seems to work OK in this particular case. There's a single column that records the status of the step and whether its last run failed or was successful or was cancelled, etc. You just need to check in the trigger what's being written to this column.

    Making sure the SQL Agent is always running is a big concern, as is monitoring when a job hangs and takes much longer than usual. Interested to read your thoughts on these issues.

    E. Titus

  • If the trigger works, I guess its ok - Im leaning towards polling from off the box instead, but the results about the same. Have not tried xp_smtp, but I think a very good idea, built in mail is...finicky? Havent thought through what I need entirely, so far I just see the symptoms. One thing I thought of when writing this is whether using master/target servers would be of any use. Seems like you'd have the same issues, but maybe not.

    Andy

    http://www.sqlservercentral.com/columnists/awarren/

  • You might want to mention the option of turning on a job log in the advanced section of each job step. The first job step is set to overwrite the log and the subsequent steps are set to append to it. This is very basic, but I overlooked it in the first set of jobs I wrote. Also you should design the job steps so that they write meaningful messages (record counts, etc).

    The one thing I'd like to see from SQL Server is the ability to have the log output datestampted and to be able to keep n versions of each log. If you know a way to implement this, it would be very valuable.

  • I think I'd put that more under debugging than managing, but it probably depends on what you're doing. Logs can be incredibly useful tools. A related point is that if you're using VBScript you can use print statements in the job and they show up in the jobhistory.

    Andy

    http://www.sqlservercentral.com/columnists/awarren/

  • I agree with you Andy, I prefer to have notification only on failure. However I have a situation where I am often importing vsat quantities of data from a remote Informix DB into a SQL Server data warehouse. The data is being transferred over a 10 MB link and is scheduled to run at 8pm and usually takes an hour, however sometimes it is still running in the morning. Obviously it has not failed therefore I don't get any notification and have to manually stop the job, in your article you mention that you run a second job to check the history of the first. Could you give more details on this please.

  • I'll have a third (final?) article on the subject up next week, I'll include some sample code for that case.

    Andy

    http://www.sqlservercentral.com/columnists/awarren/

  • We had a problem where imports would hang when importing from Access MDB's on a nightly basis. I ended up looking at some system stored procedures and based upon that created a job to look for hung import jobs. This job runs after the expected MDB imports should have completed and only looks for jobs classified using a certain name. If an import job is running the check job attempts to stop and restart it. If any import jobs are found hung, an email is sent to an email distribution list.

    -- Step 1: Create intermediate work tables

    CREATE TABLE #xp_results (job_id UNIQUEIDENTIFIER NOT NULL,

    last_run_date INT NOT NULL,

    last_run_time INT NOT NULL,

    next_run_date INT NOT NULL,

    next_run_time INT NOT NULL,

    next_run_schedule_id INT NOT NULL,

    requested_to_run INT NOT NULL, -- BOOL

    request_source INT NOT NULL,

    request_source_id sysname NULL,

    running INT NOT NULL, -- BOOL

    current_step INT NOT NULL,

    current_retry_attempt INT NOT NULL,

    job_state INT NOT NULL)

    CREATE TABLE ##tsp_help_job

    (namesysname)

    -- Step 2: Capture job execution information (for local jobs only since that's all SQLServerAgent caches)

    DECLARE @is_sysadmin INT

    DECLARE @job_owner sysname

    SELECT @is_sysadmin = ISNULL(IS_SRVROLEMEMBER(N'sysadmin'), 0)

    SELECT @job_owner = SUSER_SNAME()

    INSERT INTO #xp_results

    EXECUTE master.dbo.xp_sqlagent_enum_jobs @is_sysadmin, @job_owner

    -- Step 3: Retrieve executing job names

    -- WHICH ARE IDENTIFIED with category 'MDB Import'

    DECLARE @category_id INT

    SELECT @category_id = category_id

    FROM msdb.dbo.syscategories

    WHERE (category_class = 1) -- Job

    AND (name = 'MDB Import')

    INSERT INTO ##tsp_help_job

    SELECT sjv.name

    FROM #xp_results xpr,

    msdb.dbo.sysjobs_view sjv

    WHERE (sjv.job_id = xpr.job_id) AND (sjv.category_id=@category_id)

    AND (xpr.running = 1)

    -- Step 4: Process if found

    IF (SELECT COUNT(*) FROM ##tsp_help_job) > 0

    BEGIN

    DECLARE hungjob_csr CURSOR FOR

    SELECT Name FROM ##tsp_help_job

    DECLARE @JobName sysname, @RetVal int

    OPEN hungjob_csr

    FETCH NEXT FROM hungjob_csr INTO @JobName

    WHILE @@FETCH_STATUS = 0

    BEGIN

    EXEC @retval = SP_STOP_JOB @job_name=@JobName

    IF (@retval = 0)

    BEGIN -- wait one minute and try to restart job

    WAITFOR DELAY '00:01:00'

    EXEC SP_Start_JOB @job_name=@JobName

    END

    FETCH NEXT FROM hungjob_csr INTO @JobName

    IF (@@FETCH_STATUS = 0) --- if another hung job wait 5 minutes

    WAITFOR DELAY '00:05:00'

    END

    DEALLOCATE hungjob_csr

    EXEC master.dbo.xp_sendmail @subject='MDB Imports Hung - Attempted to Restart',@recipients='MDBHung_DistList',

    @query='SELECT Rtrim(Left(Name,60)) FROM ##tsp_help_job', @no_Header='TRUE'

    END

    -- Step 5: Clean Up

    DROP TABLE ##tsp_help_job

    DROP TABLE #xp_results

  • I wrote a small stored proc for job monitoring.

    This is usefull when you are not present in office (specially weekends) and you want to know whether all jobs run successfully or not. (even you can file up the output for your auditors)

    i call my proc with 2 params year and month.

    and it gives me a nice overall picture with all SQL jobs.

    Further enhancements for the same would be to include job category.

    Also the current proc does not handle jobs that are multiple times a day.

    so i am in a process where i can alter proc to show output as 3F, 1C, 2S (3 Failures 1 Cancel & 2 Success for that particular Day)

    I will post the new proc when i get time to do that.

    Current Proc is as follows.

    /*=========================================================================

    Title: Monthly SQL Server Agent Jobs report

    Script C:\DBA\SCRIPTS\sp_monthly_jobreport.sql

    Purpose: Monthly SQL Server Agent Jobs report

    output to be copied to excel files

    Author: Amit Jethva

    Date Created: 2003-10-28

    Date Last Updated:

    By:

    Note:

    =========================================================================*/

    create proc sp_monthly_jobreport ( @year int , @month tinyint )

    as

    select j.name as [JobName], substring( convert(varchar, run_date ) , 7, 2) as [Day] ,

    max( case run_status when 1 then 'S' when 0 then 'F' when 2 then 'R' when 3 then 'C' else 'P' end ) as [Status]

    into #jobs

    from msdb..sysjobhistory h , msdb..sysjobs j

    where j.enabled = 1

    and j.job_id = h.job_id

    and run_date between ( ( @year * 10000 ) + ( @month * 100 ) + 1 ) and

    ( ( @year * 10000 ) + ( @month * 100 ) + 32 )

    and h.step_id = 0

    group by j.name , substring( convert(varchar, run_date ) , 7, 2)

    select JobName ,

    max(case Day when '01' then Status else '' end ) As [01],

    max(case Day when '02' then Status else '' end ) As [02],

    max(case Day when '03' then Status else '' end ) As [03],

    max(case Day when '04' then Status else '' end ) As [04],

    max(case Day when '05' then Status else '' end ) As [05],

    max(case Day when '06' then Status else '' end ) As [06],

    max(case Day when '07' then Status else '' end ) As [07],

    max(case Day when '08' then Status else '' end ) As [08],

    max(case Day when '09' then Status else '' end ) As [09],

    max(case Day when '10' then Status else '' end ) As [10],

    max(case Day when '11' then Status else '' end ) As [11],

    max(case Day when '12' then Status else '' end ) As [12],

    max(case Day when '13' then Status else '' end ) As [13],

    max(case Day when '14' then Status else '' end ) As [14],

    max(case Day when '15' then Status else '' end ) As [15],

    max(case Day when '16' then Status else '' end ) As [16],

    max(case Day when '17' then Status else '' end ) As [17],

    max(case Day when '18' then Status else '' end ) As [18],

    max(case Day when '19' then Status else '' end ) As [19],

    max(case Day when '20' then Status else '' end ) As [20],

    max(case Day when '21' then Status else '' end ) As [21],

    max(case Day when '22' then Status else '' end ) As [22],

    max(case Day when '23' then Status else '' end ) As [23],

    max(case Day when '24' then Status else '' end ) As [24],

    max(case Day when '25' then Status else '' end ) As [25],

    max(case Day when '26' then Status else '' end ) As [26],

    max(case Day when '27' then Status else '' end ) As [27],

    max(case Day when '28' then Status else '' end ) As [28],

    max(case Day when '29' then Status else '' end ) As [29],

    max(case Day when '30' then Status else '' end ) As [30],

    max(case Day when '31' then Status else '' end ) As [31]

    from #jobs

    group by JobName

    drop table #jobs


    -- Amit



    "There is no 'patch' for stupidity."



    Download the Updated SQL Server 2005 Books Online.

  • which type of article you have? can you please tell me?

    ==========================================

    vernon getzler

    administration jobs in london

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply