Agent backup job never completes

  • I've got a maintenance plan that in SQL Server 7 (SP4) that does several things and does them all quite well... for a while. After running optimizations, full backups, and transaction log backups fine for days, the transaction log backup will eventually fail at some unpredictable time in the day. It doesn't actually say it failed (or generate any of those helpful MS error messages), when you check the SQL Server Agent in EM, it just says that the status is "Job Executing". It just never finishes. Other jobs that don't depend on that job continue just fine so Agent itself is not hung.

    I've got plenty of hard disk space where these backups are made. I can't find any entries in the log files around the time in question. 100's of backups work fine over several days (no, not the same number each time). I've put up with this for weeks, chasing it when I have time.

    Worse, the only way to get the backups going again is to reboot the server. Stopping and starting SQL Agent doesn't get it done. While I'm working on the real problem, I'd sure like to know how to reset Agent without a reboot.

    Thanks in advance all you gurus....

    Student of SQL and Golf, Master of Neither


    Student of SQL and Golf, Master of Neither

  • Check maintenance history to see any errors there.

    Any jobs created by maintenance wizard actually is running a DOS program sqlmaint.exe. The program may run endless for whatever reasons so the job will execute forever.

    In order to avoiding reboot the server, login to server, open the task manager and kill corresponding sqlmaint.exe will stop the job running.

  • Allen, you are a gentleman and a scholar. Just for the sake of caution, I stopped SQL Server Agent first, then killed the sqlmaint.exe task in Task Manager, then restarted Agent.

    My 5:00 backups went off without a hitch and no reboot.

    Thank you, thank you, thank you. Now why does the damn thing get hung in the first place? No info in the maintenance history either by the way - just the previous successful completions. Same with NT event log.

    Student of SQL and Golf, Master of Neither


    Student of SQL and Golf, Master of Neither

  • Run sp_who2 and check if there's a spid with backup commands and check if it is locked. If so, check the job, because to backups may be running at the same time.

  • Maybe a timeout in the backup? Lack of disk space? Killing sqlmaint is a great tip, depending on the process not always so easy.

    Andy

    http://www.sqlservercentral.com/columnists/awarren/

  • Since Allen's excellent post several days ago, we've been killling sqlmaint.exe from the NT Task Manager a couple of times a day. I'm still quite thankful for that bit of advice, I feel much better when regular backups are being made.

    As to the last two suggestions, there are no other processes doing backups (we're a bit small for multiple DBAs) and there are many gigabytes of disk space.

    But many thanks, and keep thinking!

    Student of SQL and Golf, Master of Neither


    Student of SQL and Golf, Master of Neither

  • You might try running sqlmaint directly instead of via the agent, that if something is going wrong (like a msgbox popping!) you have a chance of seeing it. I've had a few problems here and there, nothing near to having to kill it several times a day.

    Andy

    http://www.sqlservercentral.com/columnists/awarren/

  • What about writing SQL scripts to replace the jobs which were craeted by maintenance plan to perform the same tasks, such as transaction log backup?

  • Decent idea. Im stubborn enough I'd want to make the plans work!

    Andy

    http://www.sqlservercentral.com/columnists/awarren/

  • Me too. Seems like it's failing during times when CPU usage is high. Often at lunch and around 4 when a lot of our customers are running updates to our server after doing a lot of local work.

    Student of SQL and Golf, Master of Neither


    Student of SQL and Golf, Master of Neither

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply