Runaway Job Management

  • Mark Marinovic

    SSCrazy

    Points: 2421

    Comments posted to this topic are about the item Runaway Job Management

  • Lawrence-136504

    SSC Eights!

    Points: 901

    Which versions have you tested this logic on? I'm getting below error.

    Msg 8114, Level 16, State 1, Procedure xp_sqlagent_enum_jobs, Line 1

    Error converting data type nvarchar to int.

  • Mark Marinovic

    SSCrazy

    Points: 2421

    Lawrence-136504 (3/18/2010)


    Which versions have you tested this logic on? I'm getting below error.

    Msg 8114, Level 16, State 1, Procedure xp_sqlagent_enum_jobs, Line 1

    Error converting data type nvarchar to int.

    Lawrence:

    I have this code running on both SQL 2005 (SP3+CU6) and SQL 2008 (SP1+CU5) in production. Would you be able to paste the following line into the script so we can see what is being sent to that stored procedure?

    <snip>

    INSERT INTO @JobState

    EXEC [master].dbo.xp_sqlagent_enum_jobs @IsSysAdmin, @JobOwner, @JobID

    --Paste this line into your script

    PRINT 'EXEC [master].dbo.xp_sqlagent_enum_jobs ' + CAST(@IsSysAdmin AS VARCHAR(38)) + ', ''' + @JobOwner + ''', ''' + CAST(@JobID AS VARCHAR(36)) + ''''

    --Normalize VARCHAR string for time formatting

    </snip>

    Thank you!

    Mark

  • Lawrence-136504

    SSC Eights!

    Points: 901

    I'm using sql 2008 sp1.

    I do not get an error if the job is not running (EXEC [master].dbo.xp_sqlagent_enum_jobs 1, 'domainname\lawrence', 'B508F33F-8C69-42B9-9635-4CA66C05548D').

    If the job is running I get the above error.

    If I run this script and job is not running with "Alert", it sends a message to the Operator that the job has been running and the is stopped and disabled.

  • Lawrence-136504

    SSC Eights!

    Points: 901

    I re-copied the logic and then pasted it into ssms (query window) and removed some hidden characters and I'm getting below error now.

    EXEC [master].dbo.xp_sqlagent_enum_jobs 1, 'DomainName\lawrence', 'B508F33F-8C69-42B9-9635-4CA66C05548D'

    Msg 241, Level 16, State 1, Line 56

    Conversion failed when converting date and/or time from character string.

    Looks like it's bombing out on the DATEDIFF logic. Thoughts.

  • Mark Marinovic

    SSCrazy

    Points: 2421

    Lawrence-136504 (3/18/2010)


    I re-copied the logic and then pasted it into ssms (query window) and removed some hidden characters and I'm getting below error now.

    EXEC [master].dbo.xp_sqlagent_enum_jobs 1, 'DomainName\lawrence', 'B508F33F-8C69-42B9-9635-4CA66C05548D'

    Msg 241, Level 16, State 1, Line 56

    Conversion failed when converting date and/or time from character string.

    Looks like it's bombing out on the DATEDIFF logic. Thoughts.

    Thanks, I was getting the same thing with the weird chars.

    Looks like the problem with my DATEDIFF logic is that it assumes a recurrence of the job (with a non-recurring job the date is stored as 0 by SQL Server). My code looks at the last time the job would have run from the schedule and finds the difference between then and GETDATE() to come up with an answer.

    Is that the case with the job you're looking at?

    MJM

  • Lawrence-136504

    SSC Eights!

    Points: 901

    Yes. I was manually starting this job so there was no schedule.

    I created a schedule that runs every 1 minute. This job has one step that has WAITFOR DELAY '00:05:00' so it runs for 5 minutes. I'm now seeing the operator email. 🙂

    I'm currently using sysprocesses and sysjobs to capture long running jobs and it has been great but I'm finding that certain sql jobs that use Type: Operating system (CmdExec) and kicks off a console app (workflow), the job hangs and my current process does not capture this issue.

    I'm going to add this process and see how it goes. I'll keep you posted in a couple of weeks. Thks!

  • Mark Marinovic

    SSCrazy

    Points: 2421

    Lawrence-136504 (3/18/2010)


    Yes. I was manually starting this job so there was no schedule.

    I created a schedule that runs every 1 minute. This job has one step that has WAITFOR DELAY '00:05:00' so it runs for 5 minutes. I'm now seeing the operator email. 🙂

    I'm currently using sysprocesses and sysjobs to capture long running jobs and it has been great but I'm finding that certain sql jobs that use Type: Operating system (CmdExec) and kicks off a console app (workflow), the job hangs and my current process does not capture this issue.

    I'm going to add this process and see how it goes. I'll keep you posted in a couple of weeks. Thks!

    Oh, good deal! Thanks for bringing that issue to my attention. The intent of this script is to do exactly what you're doing - monitor recurring jobs for problems. When there is not a set schedule, well, I plain did not cover that case :rolleyes:

    Please let me know how this works for you - I'll update the code that's posted to do something in the case that there's no schedule set.

    Take care:

    MJM

  • Lawrence-136504

    SSC Eights!

    Points: 901

    I'll let you know how it goes.

    Also, it would be great to instead of just monitoring one job but monitoring all jobs or all jobs within a certain Job Category. Thks!

  • akljfhnlaflkj

    SSC Guru

    Points: 76202

    Thanks for the script.

Viewing 10 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply