Error 17883: Scheduler appears to be hung

  • Let me start out with an apology as I did NOT fully search the forums for related threads. With management breathing down my neck I decided to go straight to the SQL Server oracles (pun intended).

    Problem: We are running SQL Server 2000 SP3 on a Windows 2003 cluster. Hardware on each node is 4 dual-core Xeon 2.8 GHz processors with 16 GB of RAM. Today users reported being unable to log in, and just as I began looking into it, SQL Server failed over to the secondary node. Examination of the error logs shows this slice of heaven on both machines (before and after failover):

    [quote]
    10/18/2007 08:55:41,server,Unknown,Error: 17883 State: 0
    10/18/2007 08:55:41,server,Unknown,The Scheduler 1 appears to be hung. SPID 0 UMS Context 0x03908418.
    10/18/2007 08:55:41,server,Unknown,Error: 17883 State: 0
    10/18/2007 08:55:41,server,Unknown,The Scheduler 3 appears to be hung. SPID 90 UMS Context 0x0390C858.
    10/18/2007 08:55:41,server,Unknown,Error: 17883 State: 0
    10/18/2007 08:55:41,server,Unknown,The Scheduler 4 appears to be hung. SPID 0 UMS Context 0x038FC630.
    10/18/2007 08:55:41,server,Unknown,Error: 17883 State: 0
    10/18/2007 08:55:41,server,Unknown,The Scheduler 7 appears to be hung. SPID 127 UMS Context 0x03909AA0.
    10/18/2007 08:55:12,spid90,Unknown,WARNING: EC 2723c098 3 waited 600 sec. on latch 14fda114. Not a BUF latch.
    10/18/2007 08:55:12,spid90,Unknown,Waiting for type 0x4 current owning EC 0x2C71C098.
    10/18/2007 08:55:12,spid90,Unknown,Waiting for type 0x4 current owning EC 0x2C71C098.
    10/18/2007 08:55:12,spid90,Unknown,WARNING: EC 2ca40098 1 waited 600 sec. on latch 14fda114. Not a BUF latch.
    10/18/2007 08:55:12,spid90,Unknown,WARNING: EC 29e70098 7 waited 600 sec. on latch 14fda114. Not a BUF latch.
    10/18/2007 08:55:12,spid90,Unknown,Waiting for type 0x4 current owning EC 0x2C71C098.
    10/18/2007 08:55:12,spid90,Unknown,WARNING: EC 2d074098 6 waited 600 sec. on latch 14fda114. Not a BUF latch.
    10/18/2007 08:55:12,spid90,Unknown,Waiting for type 0x4 current owning EC 0x2C71C098.
    10/18/2007 08:54:41,server,Unknown,Error: 17883 State: 0
    10/18/2007 08:54:41,server,Unknown,The Scheduler 3 appears to be hung. SPID 90 UMS Context 0x0390C858.
    10/18/2007 08:54:41,server,Unknown,Error: 17883 State: 0
    10/18/2007 08:54:41,server,Unknown,The Scheduler 4 appears to be hung. SPID 0 UMS Context 0x038FC630.
    10/18/2007 08:54:41,server,Unknown,Error: 17883 State: 0
    10/18/2007 08:54:41,server,Unknown,The Scheduler 7 appears to be hung. SPID 127 UMS Context 0x03909AA0.
    [/quote]

    Also, the final straw was a machine check error on the primary node, which caused the failover.

    So:

    1. How do I put this in layman's terms for Management?

    2. What is the recommended course of action?

    3. Is this a SQL Server problem or a hardware problem?

    and

    4. Why does this happen hours before I leave for a 3 day weekend?

    At a loss,

    Gordon

    Gordon Pollokoff

    "Wile E. is my reality, Bugs Bunny is my goal" - Chuck Jones

  • Have a look at

    http://support.microsoft.com/kb/319892
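
Before digging into the KB article, it may help to tally which schedulers and SPIDs the log reports as hung. The following is only a rough sketch: the function name `count_hung_reports` and the regular expression are illustrative assumptions, and the sample text is a subset of the lines quoted in the question. It summarizes the log; it does not diagnose the cause.

```python
import re
from collections import Counter

# Hung-scheduler lines copied from the error log quoted in the question,
# plus one latch-wait line to show that non-matching lines are skipped.
SAMPLE_LOG = """\
10/18/2007 08:55:41,server,Unknown,The Scheduler 1 appears to be hung. SPID 0 UMS Context 0x03908418.
10/18/2007 08:55:41,server,Unknown,The Scheduler 3 appears to be hung. SPID 90 UMS Context 0x0390C858.
10/18/2007 08:55:41,server,Unknown,The Scheduler 4 appears to be hung. SPID 0 UMS Context 0x038FC630.
10/18/2007 08:55:41,server,Unknown,The Scheduler 7 appears to be hung. SPID 127 UMS Context 0x03909AA0.
10/18/2007 08:55:12,spid90,Unknown,Waiting for type 0x4 current owning EC 0x2C71C098.
10/18/2007 08:54:41,server,Unknown,The Scheduler 3 appears to be hung. SPID 90 UMS Context 0x0390C858.
10/18/2007 08:54:41,server,Unknown,The Scheduler 4 appears to be hung. SPID 0 UMS Context 0x038FC630.
10/18/2007 08:54:41,server,Unknown,The Scheduler 7 appears to be hung. SPID 127 UMS Context 0x03909AA0.
"""

# Hypothetical pattern written against the message text shown in this thread.
HUNG_RE = re.compile(
    r"The Scheduler (\d+) appears to be hung\. SPID (\d+) UMS Context"
)


def count_hung_reports(log_text):
    """Count hung-scheduler messages per (scheduler id, SPID) pair."""
    counts = Counter()
    for line in log_text.splitlines():
        match = HUNG_RE.search(line)
        if match:
            scheduler, spid = int(match.group(1)), int(match.group(2))
            counts[(scheduler, spid)] += 1
    return counts


if __name__ == "__main__":
    for (scheduler, spid), n in sorted(count_hung_reports(SAMPLE_LOG).items()):
        print(f"Scheduler {scheduler}, SPID {spid}: {n} report(s)")
```

Pointing the same function at the full exported log from each node would show whether the same schedulers and SPIDs (here SPID 90 and SPID 127) keep recurring across both reporting intervals.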
