Frozen Jobs, SPIDS, and Rollback

  • Well this is annoying. Two jobs got stuck this morning in production (running for 3 hours). I stopped them some time ago, but the SPIDs are still showing as Suspended and Runnable (two lines on SP_WHO2 for both SPIDS).

    I try to kill the SPIDS and get:

    SPID 101: transaction rollback in progress. Estimated rollback completion: 0%. Estimated time remaining: 0 seconds.

    SPID 97: transaction rollback in progress. Estimated rollback completion: 0%. Estimated time remaining: 0 seconds.

    I ran sp_lock and got the following:

    spiddbidObjIdIndIdTypeResourceModeStatus

    975500DB SGRANT

    975500DB UGRANT

    975500DB[BULKOP_BACKUP_DB] UGRANT

    975500DB[BULKOP_BACKUP_LOG] SGRANT

    1015400DB SGRANT

    1015400DB UGRANT

    1015400DB[BULKOP_BACKUP_DB] UGRANT

    1015400DB[BULKOP_BACKUP_LOG] SGRANT

    The jobs stopped, and failed, like they were supposed to. I assume this means they rolled back. But this says there's a rollback remaining even though there's no estimated time left on the rollback clock...

    Any thoughts?

    EDIT: These were restore jobs. That's why the backup is involved.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • Restart SQL. I've seen this a couple times at a client, never dug out the root cause (I suspect there's some non-SQL resource being held somewhere).

    Irritating, I know.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Gail,

    I think I just found something. Read Jag's post here: http://social.msdn.microsoft.com/forums/en-US/sqldatabaseengine/thread/71d3cd0b-ce4d-4202-8e3b-38b38da90a5b%5B/url%5D. It sort of makes sense to me, but I'm wondering what else could be affected if I take this advice.

    EDIT: I can't restart SQL. This is a production box.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • Brandie Tarvin (12/21/2011)


    Gail,

    I think I just found something. Read Jag's post here: http://social.msdn.microsoft.com/forums/en-US/sqldatabaseengine/thread/71d3cd0b-ce4d-4202-8e3b-38b38da90a5b%5B/url%5D. It sort of makes sense to me, but I'm wondering what else could be affected if I take this advice.

    Not sure if that's applicable? Are the jobs just jobs or are they maintenance plans? If they're just jobs, there isn't an OS process involved.

    EDIT: I can't restart SQL. This is a production box.

    Just SQL, not the server. If that is the issue I had, you won't be able to take any full backups (or diff) until you restart SQL. See if you can schedule 5 min down soon (doesn't have to be now)

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • I know you meant the service. It's just that we've become a not-quite-24 x 7 shop with over 20 databases that multiple applications depend on, so finding a time in which to restart the service is getting to be a pain.

    This whole mess started when these databases tried to restore while the de-duplication backup jobs were also trying to run (why they chose to run this in the morning, I'll never know). We're not using native SQL backup anymore, nor the usual third party tools. Since the de-dup backups are done by a different team with a different sort of tool, I don't know what issues this problem will cause them.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • Well, you were right. There was no other way. We had to do the service restart over the holiday weekend, but it did work. In this case, it didn't prevent all database backups and restores, just the restores of two reporting databases.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • We had the same error. What it was was that our job calls and outside PadFiles.exe (a little VB program). It's holding the thread. Once we endtask from Task Manager, SQL Server releases the thread.

    David LaCourse

  • I had this problem.

    I was concerned that if I restarted the SQL Server Service that the Database would perform a recovery.

    I checked the error log and I do not see that the Database is recoverying.

    I had this situation before and I forced it to recover.. I can't remember what i did.

    If I attempt to take the Database online. I get the following error:

    Msg 5011, Level 14, State 7, Line 2

    User does not have permission to alter database 'DW', the database does not exist, or the database is not in a state that allows access checks.

    Msg 5069, Level 16, State 1, Line 2

    ALTER DATABASE statement failed.

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • I had to restrt SQL Server a 2nd time.

    I was stuck at recovery at 0%. It finally moved to 2%.

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Corgi,

    I have since learned that if you kill the SPID (not the job), then the SPID will stop the job on its own and rollback. It may take a while, but it works.

    If you kill the job, though, all bets are off.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • I had this issue with an SP that was using BCP just to output a file.

    On the SQL Server, I ended the bcp.exe process in Task Manager, and the SPID vanished.

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply