Cancelled transaction log backup rolling back???

  • Over the weekend, my backup volume ran out of space and couldn't complete a transaction log backup. That problem has since been solved. However, I tried killing the job yesterday, and today it says it still has over 350K seconds to complete a rollback of the backup. I do not understand what is rolling back, since there are no changes going on in a backup. In the mean time, I'm not getting log backups since the failed job is blocking new jobs.

    The database is a little over a TB in combined data and log files. It is replicating to a couple of other of servers.

    I have enough room to keep growing the log for a few days, so I'm going to hold off until the weekend to make any changes.

    I have two options I'm considering, but am open to other suggestions. I could either restart SQL and see if the job just dies and doesn't come back, but risk a long recovery, or I could set the DB to simple mode, then back to full, then run a full backup immediately (moving it to simple won't kill my replications, right?). I lean towards the latter.

    Initially, I was afraid that even growing the log might be blocked due to the rollback, but I was able to grow it. I wonder if it would allow me to change it to simple mode.

    I found the following two links, that say either option may be viable, but I doubt they are dealing with databases of 1TB or more in size... any suggestions? Pros/cons for one vs. another?

    http://www.eggheadcafe.com/software/aspnet/33496076/kill-long-running-transac.aspx

    http://social.msdn.microsoft.com/Forums/en-US/sqldisasterrecovery/thread/0d3ac0f3-0bb8-4656-833b-64d04d649431

  • Follow the link, where the problem was also about Log file growing too large to handle.

    http://www.sqlservercentral.com/Forums/Topic826361-146-1.aspx

    Regarding changing the recovery model of the Database, I don't think it would allow while rollback is going on.

    If it is 350K records, how long it would take, not much longer right?

    You must already been keeping an eye on the # of records being rolled back.

    Why in the first place, did it grow that large?


    Bru Medishetty

    Blog -- LearnSQLWithBru

    Join on Facebook Page Facebook.comLearnSQLWithBru

    Twitter -- BruMedishetty

  • The log file isn't too large-- it just didn't free any space inside it since the log backup didn't occur. It is growing because of the amount of data we are purging from it daily and the amount of data we mine.

    It isn't 350K records, it is saying that it is 40% complete in killing the job and that it will be another 350K SECONDS (over 4 days) to complete. That is up from 250K seconds yesterday, and still growing.

    Also, this isn't the log file rolling back transactions, it is the backup I canceled that says it is rolling back.

  • Now at 562K seconds. Log file is not growing significantly at this time.

    Anyone?

  • Over 693K seconds.

    I'm planning on changing to simple recovery mode, then back to full, then running a full backup unless someone can tell me a good reason not to.

    Since I haven't gotten a log backup all week, I can't imagine it being any worse than where I'm currently at.

  • Something is blocking it from rolling back the backup. Usually, that just takes a few seconds. For it to be continually increasing the time remaining, something has to be blocking it.

    Are you in a position where you can force a failover to a backup server (cluster, log ship, whatever), and restart the SQL service on the one that's messed up? I've had to do that in similiar situations. It's a last resort, but you may need to head that way if your current plan doesn't work.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Did you try to kill the Backup process id manually? Kill the backup process ID forcefully and then run the full backup. Changing the database from Full to Simple will not impact your replication setup. It will work.

  • GSquared (12/4/2009)


    Something is blocking it from rolling back the backup.

    According to SQL, nothing is blocking it.

  • Atul DBA (12/4/2009)


    Did you try to kill the Backup process id manually? Kill the backup process ID forcefully and then run the full backup. Changing the database from Full to Simple will not impact your replication setup. It will work.

    Thanks Atul!

    Yes, I did try killing the ID, which is how I noticed it was rolling back (should have noticed that from the uspwho).

    Running the command:

    Kill 82

    results in:

    SPID 82: transaction rollback in progress. Estimated rollback completion: 40%. Estimated time remaining: 717863 seconds.

    At 8PM I'll be coming back into the office and changing from full to simple, then back again, verify that the process is dead/gone, and running a new full. Wish me luck!

  • I am not sure about you OS. You can use the /F option with KILL command to make it forcefully kill the process. If you have TaskKill utility installed, you can use TaskKill /F /T to kill the process.

    You can check the blocking/hanging process in Process Browser for SQL. Kill that as well. The process seems to be lost and needs to be killed.

    The last option is to start the SQL Server service. As you have Replication setup, you can failover to other server database and then restart the service. You must take proper approval before service restart as it will have direct impact on all the applications and all the databases will be down during the restart process. Proper communication and approval is required for any restart of service. Before restart, make sure you bring the database so single user mode to minimise the recovery time.

  • Atul DBA (12/4/2009)


    I am not sure about you OS. You can use the /F option with KILL command to make it forcefully kill the process. If you have TaskKill utility installed, you can use TaskKill /F /T to kill the process.

    There's no way to forcefully kill a session in SQL, it has to roll back or the database would be transactionally inconsistent and would have to be marked suspect. We're not talking about an operating system-level kill of a process (which taskkill kills and for which /F and /T are valid switches), we're talking about a session inside SQL Server.

    Restarting SQL may result in prolonged restart-recovery as the rollback completes there (if this is not enterprise edition with fast recovery).

    As you have Replication setup, you can failover to other server database and then restart the service.

    Replication doesn't allow an easy failover to the other database. If the replicated DB is a complete copy (which is not required), failover's a manual process that required client apps to change connection strings. Failing back's even harder. Are you thinking of database mirroring?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • I was unable to move the DB to simple recovery mode. I was able to restart SQL without a huge recovery. This is very good, as last time I had a problem with this server on a restart, the recovery ran for 3.5 days.

  • I remember saying that (in my initial reply) it would not be possible to change the recovery model when the rollback was running.

    What are error messages SQL Server reported when the instance was started. If you find anything, post it here and would be beneficial to us.

    Have you performed a full backup now?


    Bru Medishetty

    Blog -- LearnSQLWithBru

    Join on Facebook Page Facebook.comLearnSQLWithBru

    Twitter -- BruMedishetty

  • Good to hear this. Please take a full backup of your database. Check that there is no error at IO level. Sharing your learning will be more helpful to others.

  • I did take a full backup, no issues. no errors in any logs.

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply