Killed Process still rolling back

  • So the night shift guy was running a process that had a problem, so he killed it through Activity Monitor. It had run about 20 minutes before he killed it. It now says "KILLED/ROLLBACK", but has been well over an hour. In my limited experience, rollbacks have taken the same time or less than the original process. CPU and Physical I/O both show 0, so I don't know what's going on there. Is there another way to look at a SPID and see what's really going on ... if anything ?

    EDIT: You can ignore this post. Bad info due to too many of us here working on not enough sleep.

  • Think the process is in ROLLBACK and once its completed the process will get deleted

  • It just seems strange that a rollback can take more than 3 or more times the original time of the process !

  • Sometimes it takes that much time, depending on what it was doing before kill.

    DBCC INPUTBUFFER(SPID) would tell you the current status.

    Pradeep Adiga
    Blog: sqldbadiaries.com
    Twitter: @pradeepadiga

  • SQL Server (2000, 2005, 2008): Recovery/Rollback Taking Longer Than Expected

    (Microsoft PSS Guys Comments)

    Interesting discussion around rollback. The transaction has run for 14 hours and then a KILL SPID was issued. The SPID goes into rollback and happed to do so for 2 days and 4 hours.

    The natural question is why not 14 hours to rollback?

    It is hard to tell without some specific details but if the original query used parallel workers it could have done more than 14 hours of work. For example if the update ran with 4 workers, in parallel, it would have done 56 hours of work or 2.3 days. Rollback is mostly single threaded so it might take 4x times as long to complete the rollback. Also, read ahead is often more efficient during execution because the plan is known where rollback has to work from log records and might not be able to drive buffer fetches in the same way.

    The next question was: Should I restart the SQL Server service? Would that be faster?

    Generally it is not faster because you have to start recovery again from the begin transaction and determine where to start fix-ups of the databases. This means lots of I/O again and this time with a cold cache and probably no other access to the database.

    Still Think Recovery/Rollback Is Stuck

    If you really think that recovery or rollback should have completed you can issue the following T-SQL commands to capture a dump of the SQL Server process and provide it to Microsoft SQL Server Support for further analysis.

    dbcc traceon(-1, 2546, 2551)

    dbcc stackdump(1)

    dbcc traceoff(-1, 2546, 2551)

    This will generate a .mdmp file in the \LOG directory. Continue to allow the recovery or rollback to continue and contact Microsoft SQL Server Support for further assistance.

    Large Log File

    You should also be aware that allowing the log file to grow large and generate huge numbers of VLFs could increase recovery time.

    If the database being recovered has a lot of VLFs (Virtual Log Files), the discovery phase that occurs during pre-recovery can take a long time to complete. The number of rows returned by DBCC LOGINFO (dbname) will confirm if this is a possibility. In a typical problem scenario, this query will return several 1000 in a difficult scenario 1.2 million VLFS. This problem could occur even if there are no transactions to recover. The best solution is to shrink the log file to a very small value, confirm that the number of VLFs is few 100s or less, then re-size the log by expanding (or growing ) in a large increment.

    MJ

  • I have a similar problem - killed processes still rolling back more than a day later - but a linked MySQL server is involved. Some of the processes that are "stuck" are just Select queries, so I wouldn't think there would be anything to rollback (for example: select * from openquery([MYSQL-WARRANTS],'select * from warranttab'). Kill with statusonly shows that it is 0% complete with 0 seconds remaining. After waiting a day, we rebooted the MySQL server hoping the process might just error out, but it still shows Running and Killed/Rollback. We tried stopping and starting the DTC. Does anyone know of any other way to cancel these processes other than stopping and starting the SQL server and hoping they end?

  • In the past i've mentioned to Microsoft that the option to abort the rollback (leaving DB corrupt) in favour of a restore would be a handy option - albeit dangerous if used incorrectly! I've had a few scenarios where i know a restore would have taken hours less than the rollback. What do others think?

  • No way. MS does that and the next thing CSS/support is getting all sorts of calls "SQL Server corrupted my database!"

    There must never be a documented/supported command or statement that leaves a DB corrupt. Want to restore? Drop the DB and restore.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • I suppose that wouldn't be very safe, but in my case, I was exclusively outputting to a little MySQL database that only contained data that I passed to it, so there really was nothing at all that needed rolling back. Stopping and restarting the SQL server did end the jobs that needed to go away, though.

  • In your specific case.

    If there was a supported command that did what you're asking, it would have to be supported for everyone, from top DBAs that know the risk down to someone that looking at SQL for the first time. A supported "break my DB beyond recovery" statement is something I hope never appears in the product

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • dlautwein (8/22/2012)


    I have a similar problem - killed processes still rolling back more than a day later - but a linked MySQL server is involved. Some of the processes that are "stuck" are just Select queries, so I wouldn't think there would be anything to rollback (for example: select * from openquery([MYSQL-WARRANTS],'select * from warranttab'). Kill with statusonly shows that it is 0% complete with 0 seconds remaining. After waiting a day, we rebooted the MySQL server hoping the process might just error out, but it still shows Running and Killed/Rollback. We tried stopping and starting the DTC. Does anyone know of any other way to cancel these processes other than stopping and starting the SQL server and hoping they end?

    I also have same problem. My job pull data from teradata. It was running for last 3 days so i killed it 4th day, however now from last 2 days its in recovery. I had disabled daily schedule of job. Dontknow if restart of services will fix it.

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply