A Failed Jobs Monitoring System


stew_b
This is a great article! How can I take it to the next level? I have a multistep job that is executing some but not all of its steps. The SQL logs/tables report job success, but I now suspect that is because at least SOME part of the job succeeded. Without splitting each step out into a separate job, is there a way to get detailed information about the success or failure of each step within the job?
Strommy
Interesting solutions. For large environments, enterprise monitoring software is very useful. Our environment has 200+ SQL instances and we use NetIQ. We finished up a project that monitors all types of problems and will create a service ticket based on the monitoring events. The best part is the set-up for new servers. Install the client, add the server to the proper monitoring group in NetIQ, and we can start getting monitoring events from SQL (including failed and long-running jobs).

Thanks,
Eric



tjaybelt
That is the next step. It's difficult to gather specific info from the job steps, because they don't key into a particular run of the job; they just seem to be ordered by time and linked to a job, so the organization is a bit lacking. Step 0 is the job report. Other steps could fail, but finding the specific step that failed, and its specific message, may be difficult if the job fails a lot of times in a row...
I'm struggling with piecing this together now.
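
One way to piece this together is sketched below in Python, as an illustration only. It assumes the history rows for a single job have already been fetched, that their fields mirror the `instance_id`, `step_id`, `step_name`, `run_status` and `message` columns of `msdb.dbo.sysjobhistory`, and that the step 0 row of a run is written after that run's step rows. The `HistoryRow` type, the function names and the sample data are all hypothetical.

```python
from collections import namedtuple

# Hypothetical row shape; field names mirror columns of msdb.dbo.sysjobhistory.
HistoryRow = namedtuple("HistoryRow", "instance_id step_id step_name run_status message")

FAILED = 0  # run_status: 0 = failed, 1 = succeeded


def group_into_runs(rows):
    """Split one job's history into runs; each run is closed by its step 0 row."""
    runs, current = [], []
    for row in sorted(rows, key=lambda r: r.instance_id):
        if row.step_id == 0:
            # Step 0 is the job outcome: everything collected so far belongs to it.
            runs.append({"outcome": row, "steps": current})
            current = []
        else:
            current.append(row)
    # Steps with no closing step 0 row (a run still in progress) are ignored.
    return runs


def failed_steps(rows):
    """Return (outcome_instance_id, step_id, step_name, message) per failed step."""
    return [
        (run["outcome"].instance_id, s.step_id, s.step_name, s.message)
        for run in group_into_runs(rows)
        for s in run["steps"]
        if s.run_status == FAILED
    ]


# Made-up sample: the first run reports success even though step 2 failed.
sample = [
    HistoryRow(101, 1, "Load", 1, "ok"),
    HistoryRow(102, 2, "Transform", 0, "Step 2 failed"),
    HistoryRow(103, 3, "Report failure", 1, "ok"),
    HistoryRow(104, 0, "(Job outcome)", 1, "The job succeeded."),
    HistoryRow(105, 1, "Load", 1, "ok"),
    HistoryRow(106, 2, "Transform", 1, "ok"),
    HistoryRow(107, 0, "(Job outcome)", 1, "The job succeeded."),
]

for hit in failed_steps(sample):
    print(hit)
```

Grouping on the step 0 row, rather than on timestamps, is what keeps several failures in a row from blurring together: each failed step is reported against the outcome row of its own run.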



thecosmictrickster@gmail.com
@Strommy: In my last job we used NetIQ. A lot easier than Tivoli. You just have to watch out when adjusting child jobs - any changes to the parent job will overwrite them. A gotcha for people who don't watch which view they are in on the console.

@stew_b: A possible solution is to put a failure-handling step after each task step. Like so:

Step 1: Step1Tasks. On success go to step 3; on failure go to next step.
Step 2: Report failure of step 1. On success go to next step; on failure go to next step.
Step 3: Step3Tasks. On success quit with success; on failure go to next step.
Step 4: Report failure of step 3. On success quit with success; on failure quit with failure.


It depends on what your steps are doing and whether you can incorporate error reporting within them (e.g. RAISERROR). Failing that, use the above structure with e.g. RAISERROR('Step x failed', 16, 1) WITH LOG and check the ERRORLOG (manually or automatically), or get it to send an email (you may get spammed, watch out for that!).
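
To see how the four-step layout above behaves, here is a small Python simulation of the control flow. It is a sketch only: it does not talk to SQL Server Agent, and the action labels, the `FLOW` table and `run_job` are names made up for the illustration.

```python
# Hypothetical labels for what a step does when it finishes.
NEXT, QUIT_OK, QUIT_FAIL = "next", "quit_success", "quit_failure"

# step_id -> (name, on_success, on_fail); an integer action means "go to that step".
FLOW = {
    1: ("Step1Tasks", 3, NEXT),
    2: ("Report failure of step 1", NEXT, NEXT),
    3: ("Step3Tasks", QUIT_OK, NEXT),
    4: ("Report failure of step 3", QUIT_OK, QUIT_FAIL),
}


def run_job(flow, failing):
    """Walk the flow; `failing` is the set of step ids that fail.

    Returns (list of executed step ids, final job outcome).
    Assumes the last step always quits, as step 4 does here.
    """
    executed, step_id = [], min(flow)
    while True:
        executed.append(step_id)
        _name, on_success, on_fail = flow[step_id]
        action = on_fail if step_id in failing else on_success
        if action == QUIT_OK:
            return executed, "success"
        if action == QUIT_FAIL:
            return executed, "failure"
        step_id = step_id + 1 if action == NEXT else action


print(run_job(FLOW, set()))   # ([1, 3], 'success')
print(run_job(FLOW, {1}))     # ([1, 2, 3], 'success')
print(run_job(FLOW, {3}))     # ([1, 3, 4], 'success')
print(run_job(FLOW, {3, 4}))  # ([1, 3, 4], 'failure')
```

The second and third calls show the catch: when a reporting step succeeds, the job as a whole still ends in success, so the failure is only visible through whatever the reporting step logged or sent. Changing a reporting step's on-success action to `QUIT_FAIL` would make the job outcome itself reflect the failure.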



Scott Duncan

MARCUS. Why dost thou laugh? It fits not with this hour.
TITUS. Why, I have not another tear to shed;
--Titus Andronicus, William Shakespeare

Strommy
Good points about the job failure logic. In general, if a job step fails and a later step sends a failure notification, we have that notification step exit as failed, regardless of whether the notification itself succeeded or failed. As an aside, if a ticketing system is monitoring job failures, we don't need the notification step. That was one of our goals for integrating with NetIQ.

Thanks,
Eric



stew_b
Great idea Scott - Thank you!
sqldba-294117
I came across this site, so I thought I'd share it with the forum.
SQL job manager
View and manage SQL Server Jobs

Free for a limited time

http://www.idera.com/Products/SQLjobmanager/
FreeHansje
"If you're already setting up a separate server, why not use MOM?"
I've looked into that and cannot find a rule to monitor failed jobs. If I have to build this rule myself, then why not go for a tailor-made solution anyway?

"Personally, this seems like a lot of heavy lifting and manual labor, and the effort involved seems to outweigh the costs of commercial tools already available."
This is correct. However, have you looked into the extra load some of these tools add to a server? Dunno about Sentry's Eventmanager, but others were a firm no-no after a day or so of testing. Idera JM was one of those.

"For job monitoring, I just get each server to send an email (per job) when a job fails."
Yes, nice one, but we have more than 1000 jobs running all over the place, and alas, the DBA is not the only one who can go there and add/delete jobs. I prefer a central monitoring system.

However, I can't find the job that runs this setup. I can figure it out, but am I missing something here? Is it not part of the download?

Greetz,
Hans Brouwer
Rudy Panigas
On using MOM: yes, it is a good tool, but not everyone has the budget to get MOM. That was the situation at my location. There would be money available now, but I think this setup is working much better.

We also use this tool/setup to collect additional information, not just from SQL Server but from Oracle and MySQL servers as well. I'm not certain whether MOM does Oracle/MySQL, but so far this setup is working well.

It just goes to show that there are many ways to get things done, and you can bet a DBA/developer will figure something out.

Thanks for the comments,

Rudy



darryl_marshall
We use a similar system to monitor all our servers. I had a look at Idera SQL Job Manager, but it ran very slowly on my machine and we needed more information about other aspects of the servers than just the jobs. At the moment, lots of details on disk space, database size, job outcomes and a number of logging tables are pulled back to a central server. SQL Server Reporting Services is then used to report against the data and send out email reports as needed.


