Get an email notification of the failed job step detail

  • Comments posted to this topic are about the item Get an email notification of the failed job step detail

  • Nice idea, but I don't like putting potentially long-running code calling things like email into triggers on system tables. What if email hangs for some reason?

    I have a slightly different way to handle failed job steps: an Agent job that runs hourly and populates a table with failed job steps and sends an email alert if any rows are returned.

    If you're wondering why we need this: SQL Server Agent doesn't report a job as failed when only a step in the job fails, if that step was set up to continue to the next step on failure. So you have to query the system tables.
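    For readers who want to try the hourly check described above, here is a minimal sketch. It is not the poster's actual job. The table dbo.FailedJobStepLog, the mail profile 'AgentAlerts', and the recipient address are all hypothetical placeholders, and it assumes Database Mail is already configured. It also simplifies the filter to run_status = 0 on real steps (step_id > 0).

```sql
-- Hypothetical logging table; the name and columns are illustrative only.
-- It is created in whatever database the job step runs in.
IF OBJECT_ID(N'dbo.FailedJobStepLog', N'U') IS NULL
    CREATE TABLE dbo.FailedJobStepLog
    (
        instance_id INT            NOT NULL PRIMARY KEY, -- sysjobhistory row id
        job_name    SYSNAME        NOT NULL,
        step_id     INT            NOT NULL,
        step_name   SYSNAME        NOT NULL,
        message     NVARCHAR(4000) NULL
    );

-- Copy failed steps that have not been logged yet.
-- run_status = 0 means "failed"; step_id = 0 is the job outcome row, so skip it.
INSERT INTO dbo.FailedJobStepLog (instance_id, job_name, step_id, step_name, message)
SELECT jh.instance_id, j.name, jh.step_id, jh.step_name, jh.message
FROM msdb.dbo.sysjobs AS j
INNER JOIN msdb.dbo.sysjobhistory AS jh ON j.job_id = jh.job_id
WHERE jh.run_status = 0
  AND jh.step_id > 0
  AND NOT EXISTS (SELECT 1
                  FROM dbo.FailedJobStepLog AS l
                  WHERE l.instance_id = jh.instance_id);

-- Send one alert only when this run logged something new.
-- Profile name and recipient are placeholders, not real values.
IF @@ROWCOUNT > 0
    EXEC msdb.dbo.sp_send_dbmail
        @profile_name = N'AgentAlerts',
        @recipients   = N'dba@example.com',
        @subject      = N'SQL Agent: failed job steps detected',
        @body         = N'New rows were added to dbo.FailedJobStepLog. Query that table for details.';
```

    Keying the log on instance_id means a failure is reported once, no matter how often the job runs, and a mail hang only stalls this job instead of a trigger on a system table.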

    An exasperating quirk I discovered, confirmed in BOL: for some unknown reason, MS stores the date and time that the step ran as integers in separate columns, run_date and run_time. The run_time INT isn't even something like "elapsed seconds after midnight": it's the digits of the clock time (hhmmss) read as a number, so leading zeroes are lost. For example, 08:59:59 is stored as 85959, and one second later, 09:00:00 is stored as 90000.

    If you want the data back as a DATETIME, you'll need to reconstruct it. With some reluctance, I'm posting my script. It's an ugly, string-based reconstruction, and I'm sure many will be happy to jump in with cleaner code:

    SELECT
        j.name AS JobName,      --NVARCHAR(128)
        jh.step_id,             --INT
        jh.step_name,           --NVARCHAR(128)
        jh.sql_message_id,      --INT
        jh.sql_severity,        --INT
        --sysjobhistory natively stores run_date and run_time as separate integers. Combine and convert to DATETIME. Why MS, why??
        CAST
        (
            --Date portion, which will always be an 8-digit INT in the form yyyymmdd:
            CAST(jh.run_date AS VARCHAR(8)) + ' ' +
            --Time portion is harder, b/c it is hhmmss stored as an INT with no leading zeroes,
            --so it can be anywhere from 1 to 6 digits (0 for midnight, 85959 for 08:59:59).
            --This construct prepends 6 zeroes, then takes the rightmost 6 characters, yielding a 6-character string:
            --RIGHT('000000' + CAST(run_time AS VARCHAR(6)), 6)
            --We then slice and re-format to hh:mm:ss and combine with the date, then cast the whole shebang as DATETIME.
            LEFT(RIGHT('000000' + CAST(jh.run_time AS VARCHAR(6)), 6), 2) + ':' +
            SUBSTRING(RIGHT('000000' + CAST(jh.run_time AS VARCHAR(6)), 6), 3, 2) + ':' +
            RIGHT(RIGHT('000000' + CAST(jh.run_time AS VARCHAR(6)), 6), 2)
        AS DATETIME) AS RunDateTime,
        jh.message,             --NVARCHAR(1024)
        jh.run_status,          --INT
        jh.run_duration         --INT
    FROM msdb.dbo.sysjobs j
    INNER JOIN msdb.dbo.sysjobhistory jh ON j.job_id = jh.job_id
    WHERE jh.sql_severity > 0
       OR jh.run_status = 0
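    As one possible cleaner take on the reconstruction, here is a sketch that avoids string slicing for the time portion: it splits run_time into hours, minutes, and seconds with integer division and modulo, then adds the total seconds to the date. The sample run_date/run_time pairs are made up for illustration and are not read from sysjobhistory.

```sql
-- Sample values are invented for illustration; swap the VALUES list for
-- msdb.dbo.sysjobhistory to run this against real history rows.
SELECT
    v.run_date,
    v.run_time,
    DATEADD(SECOND,
              (v.run_time / 10000) * 3600          -- hours   -> seconds
            + ((v.run_time / 100) % 100) * 60      -- minutes -> seconds
            + (v.run_time % 100),                  -- seconds
            -- yyyymmdd with no separators converts to DATETIME regardless of language settings
            CAST(CAST(v.run_date AS CHAR(8)) AS DATETIME)) AS RunDateTime
FROM (VALUES (20240131, 0),        -- midnight
             (20240131, 85959),    -- 08:59:59
             (20240131, 90000),    -- 09:00:00
             (20240131, 235959)    -- 23:59:59
     ) AS v(run_date, run_time);
```

    Because run_time is an INT, the divisions truncate, so no zero-padding is needed. The VALUES table constructor assumes SQL Server 2008 or later.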

    Thanks for the article!

    Rich

  • I agree that it makes sense not to send an email from the trigger. We needed to be notified with the least lag possible, and there were also multiple SQL Server Agents to manage with minimal modification and maintenance. The trigger has been running on our 4 servers for the last 10 months, and so far emailing (SMTP) from it hasn't caused any headaches. One thing, though: when a job fails we end up getting 2 emails, the original notification followed by the email the trigger generates. Kinda ugly, but it beats going to the server and reading the job history manually.

  • Thanks for yet another good script.
