Something stopped SQL Server Agent unexpectedly

  • richmondlake

    SSC Journeyman

    Points: 88

    Two nights in a row, during the same 1-hour time frame, something shut off SQL Server Agent on my SQL Server 2000 box. I noticed this because all of my jobs were failing…so when I looked into the box, I found that the SQL Server Agent service had been stopped.

    There is no mention of it stopping in the logs and there is not a lot of info that seems to be related in the Windows event logs either.

    It is as if something did not like it and disabled it without leaving a trace.

    The “server” people seem to think it is my issue and that there must be something in the database doing it. I think that unless I write an SP to turn it off, or dial in and shut it off myself, nothing within the database can do it.

    Has anyone else had this type of issue occur?

    I have had a few issues in the past where a job or two did not fire due to an invalid user name and password, but I attributed that to a security script or process running outside the box that, for whatever reason, did not recognize the Agent. But to actually come in and turn it off…two nights in a row…at the same time each night…

    I do not know what to make of this.

    The only items in the Windows logs are for backups of the drives and system files on the server, but not the database files. Other than that, there is a security script being run around the same time, but that is from the security people, not me.

  • Mike - CI

    SSCarpal Tunnel

    Points: 4146

    I have had this happen to me before on an old SQL 2000 server as well, and it ended up being due to Schedule Blocking at the database level. Obviously that may not be your situation but I think the way you diagnose it would be similar.

    You started by looking at the Windows event log, and I am guessing you have looked at the SQL error log as well? If not, definitely do so.

    Next, since you know the exact time frame for this error, schedule a server-side trace to run at that time so that you can see exactly what is happening on the server. If it is SQL related, you should be able to determine that fairly quickly.

    If you are still convinced it is not a SQL side issue, gather perfmon data for that time period to see what the counters are on the server (processor, memory, page splits, cache...whatever you think you need). The more information you gather, the better.

    One of these things will surely point you to what is causing the Agent to stop. It is most likely something within SQL but at least this will tell you for sure.
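
    The time-window approach described above can be sketched in a few lines of Python. This is only an illustration: the function name `events_near`, the 30-minute window, and the sample entries are all hypothetical and are not taken from any real log. It simply merges entries collected from different sources and keeps the ones that fall close to the time the Agent stopped.

```python
from datetime import datetime, timedelta


def events_near(events, stop_time, window_minutes=30):
    """Return (timestamp, source, message) tuples within +/- window_minutes
    of stop_time, sorted by timestamp. All names here are hypothetical."""
    window = timedelta(minutes=window_minutes)
    return sorted(
        (e for e in events if abs(e[0] - stop_time) <= window),
        key=lambda e: e[0],
    )


# Made-up sample entries standing in for rows gathered from the Windows
# event log, the SQL error log, and any other source you can get hold of.
sample_events = [
    (datetime(2009, 4, 20, 1, 20), "Security script", "Script run began"),
    (datetime(2009, 4, 20, 1, 5), "Windows Application log", "File backup started"),
    (datetime(2009, 4, 20, 9, 0), "SQL error log", "Starting up database 'XYZ'"),
]

# Assumed time at which the Agent service was found stopped.
agent_stop = datetime(2009, 4, 20, 1, 30)

for ts, source, message in events_near(sample_events, agent_stop):
    print(ts.isoformat(), source, message)
```

    Anything that shows up in the window on both nights is a candidate worth asking the owning team about.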

  • richmondlake

    SSC Journeyman

    Points: 88

    So…Schedule blocking can turn off SQL Server Agent?

    Physically turn it off, completely?

    Oh, and I checked the SQL error logs; no new information.

  • george sibbald

    SSC Guru

    Points: 104200

    Other than that, there is a security script being run around the same time but that is from the security people, not me.

    What does that script do? Security people LOVE disabling things!

  • richmondlake

    SSC Journeyman

    Points: 88

    Do not know, they will not tell me nor do they tell me when things run.

    Also, last time something like this happened, SQL Agent did not get shut down. I have experienced the occasional “Job Failed because the user privileges are insufficient” or something like that but never have seen SQL Agent get turned off.

    It is as if something turned off the SQL Server Agent Windows service itself. There is no trace at all in any of my logs of anything touching Agent.

    All I can see is that there were some backups going on (system files, not SQL backups) and some script was running, but I have been “promised” that nothing has been changed in the scripts or group policies.

    If something within SQL can turn off Agent, wouldn’t I be able to find a trace of it somewhere?

    Side note:

    It may be worth mentioning that after a reboot a few months ago (presumably after a Windows update or patch), when SQL came back up, something had set a database to AUTOCLOSE ON, so every time something calls the database there is a log entry saying: Starting up database “XYZ”.

    I see no reference to locks, no memory page failures, nothing.

    The actual errors the packages gave after the Agent was turned off were these:

    First job run after Agent was turned off:

    Executed as user: serverdomain\serveruseraccount. DTSRun: Loading... Error: -2147467259 (80004005); Provider Error: 5105 (13F1) Error string: Device activation error. The physical file name 'd:\Program Files\Microsoft SQL Server\MSSQL\data\XYZ_Data.MDF' may be incorrect. Error source: Microsoft OLE DB Provider for SQL Server Help file: Help context: 0. Process Exit Code 1. The step failed.

    And this one came up and was the only error from then on.

    Executed as user: serverdomain\serveruseraccount. DTSRun: Loading... Error: -2147467259 (80004005); Provider Error: 4064 (FE0) Error string: Cannot open user default database. Login failed. Error source: Microsoft OLE DB Provider for SQL Server Help file: Help context: 0. Process Exit Code 1. The step failed
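
    For anyone reading these messages: the values in parentheses are just the hexadecimal forms of the decimal codes next to them. A minimal Python sketch of the arithmetic follows; the helper name `to_signed_32` is made up for illustration. The top-level code 80004005 is a generic "unspecified error" result, so the provider errors 5105 and 4064 carry the useful detail.

```python
def to_signed_32(value):
    """Interpret an unsigned 32-bit value as a signed 32-bit integer."""
    return value - (1 << 32) if value & 0x80000000 else value


# Top-level error: hex 80004005 read as a signed 32-bit number.
print(to_signed_32(0x80004005))  # -2147467259

# Provider errors: hex 13F1 and FE0 are the decimal codes in the messages.
print(0x13F1, 0xFE0)  # 5105 4064
```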

  • MANU-J.

    SSC-Dedicated

    Points: 31126

    Better to run a Profiler trace and analyze the statements that were executing when Agent stopped.

    MJ

  • Mike - CI

    SSCarpal Tunnel

    Points: 4146

    richmondlake (4/21/2009)


    So…Schedule blocking can turn off SQL Server Agent?

    Physically turn it off, completely?

    Oh, and I checked the SQL error logs; no new information.

    Scheduler blocking is what caused the Agent to stop when I experienced this a couple of years ago. Bear in mind, that is scheduler blocking, not process blocking. Essentially, a third-party piece of reporting software was running in some sort of loop that caused certain processes to run 40-50 times each, which was causing the error.

    I just did a quick search, and a couple of others had the error that I was seeing:

    http://www.sqlservercentral.com/Forums/Topic473721-5-1.aspx

    However, this was giving me errors in the error log. Once I traced and found the processes that were causing the problem, I was able to kill them and resolve the issue, and the Agent has not stopped since. Obviously you are not seeing these errors in the error log, so it would stand to reason that you are not seeing the same issue. However, based on this experience, I do know that it is possible for something at the SQL level to cause the Agent to stop (at least on SQL 2000). It is rare, but it does happen.

  • richmondlake

    SSC Journeyman

    Points: 88

    I did confirm that outside backup software was running on this box both nights at the exact time in question and was trying to back up every MDF and LDF. There are errors in its logs saying that every single file failed to back up because it was in use.
