My log file became not available and the database went into Recovery Pending mode

  • Around 1:01 AM in the morning, the drive that houses my log files (1 of 4) filled up.

    The particular database with the large log file was being replicated to a reporting system, as well as 2 other dbs on this server.

    After the flog file filled up, it started throwing error messages about being full. It did this for a few minutes, until it threw this message at 1:05.

    Log entry:Error: 9001, Severity: 21, State: 5.

    The log for database 'ILCentralDataEnt' is not available. Check the event log for related error messages. Resolve any errors and restart the database.

    at 1:14 it reports this

    Log entry:Error: 9002, Severity: 17, State: 6.

    The transaction log for database 'ILCentralDataEnt' is full. To find out why space in the log cannot be reused, see the log_reuse_wait_desc column in sys.databases

    then this

    Log entry:Error: 3314, Severity: 21, State: 1.

    During undoing of a logged operation in database 'ILCentralDataEnt', an error occurred at log record ID (386988:95874:18). Typically, the specific failure is logged previously as an error in the Windows Event Log service. Restore the database or file from a backup, or repair the database.

    I find it int his state around 2:00 am, as I was asleep. After some research, and stopping several other monitoring jobs that were spewing errors, I started looking into steps to remediate. Since the log was full and the db was in recovery pending state, I thought it best to stop the replication distribution job. The log_reuse_wait_desc FROM sys.databases said 'REPLICATION' for this particular database.

    Around 2:26am i found a forum post that suggested I take it offline and back online. I was able to take it offline, which changed its status from Recovery_Pending to offline. When I attempted to bring it online, it errored with

    /*------------------------

    ALTER DATABASE ILCentralDataEnt SET ONLINE

    ------------------------*/

    Msg 945, Level 14, State 2, Line 1

    Database 'ILCentralDataEnt' cannot be opened due to inaccessible files or insufficient memory or disk space. See the SQL Server errorlog for details.

    Msg 5069, Level 16, State 1, Line 1

    ALTER DATABASE statement failed.

    Msg 9002, Level 17, State 6, Line 1

    The transaction log for database 'ILCentralDataEnt' is full. To find out why space in the log cannot be reused, see the log_reuse_wait_desc column in sys.databases

    The error log said the following as well

    Log entry:Error: 928, Severity: 20, State: 1.

    During upgrade, database raised exception 945, severity 14, state 2, address 0000000001330B49. Use the exception number to determine the cause.

    Log entry:Error: 3314, Severity: 21, State: 1.

    During undoing of a logged operation in database 'ILCentralDataEnt', an error occurred at log record ID (386988:95874:18). Typically, the specific failure is logged previously as an error in the Windows Event Log service. Restore the database or file from a backup, or repair the database.

    After talking to Gail Shaw over twitter, it was suggested that I write down this diatribe here in the hopes that she and others could help.

    I also turned the distribution job back on, which had not errored, but I figured i was hosed and had turned off during a previous step.

    So turning to the drive where the logs are hosted, I find that there is no room, but one of the logs could be shrinked up. I have done this, which has made 10G free where the logs are. I wish i had added a second log file to this database, as in the past, when its log has filled up, i was able to add a second log and get it working again. I do not know how to do this when its in either Offline or Recovery_Pending state. I do not think this is possible. But state it as i am thinking outloud here.

    Yes, this database has filled up its log before, and was still online at the time, simply spewing full messages. I was able to add a second log on another drive, and get it to continue processing after that. This is a new state it has reached, one i have not troubleshot before.

    current log_reuse_wait_desc FROM sys.databases is 'REPLICATION'

    Database is in REcovery_Pending state

    replication distribution job is running

    drive where log files reside has 10G free

    log file is sitting at 93G

    error log was DMed to gail for perusal

  • could you please add @@version information of that instance ?

    Gail has way more recovery experience than me, so she may come up with less drastic measures than I might suggest.

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • tjaybelt (12/24/2013)


    Log entry:Error: 3314, Severity: 21, State: 1.

    During undoing of a logged operation in database 'ILCentralDataEnt', an error occurred at log record ID (386988:95874:18). Typically, the specific failure is logged previously as an error in the Windows Event Log service. Restore the database or file from a backup, or repair the database.

    And that's the bummer.

    It may be a bug where SQL didn't reserve enough space to roll back. Shouldn't ever happen, but there could be a bug. Try freeing up some space and trying again to bring the DB online. From the error log it's running out of space during the undo phase of recovery, which shouldn't ever happen. You may need to move one of the other logs to another drive temporarily to make space.

    If you get the same errors when there is enough space (error during undo/redo), locate your backups and start restoring (or emergency mode repair, but I'd recommend the restore).

    btw, you seem to have a problem with Distribution too. That may be why the replication wait, if the log reader agent wasn't working properly, log would have grown as it would not be able to reuse space.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • ALZDBA (12/24/2013)


    could you please add @@version information of that instance ?

    Microsoft SQL Server 2008 R2 (SP2) - 10.50.4000.0 (X64)

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • well, after some much needed sleep, and some review by the goddess of databases, i awoke to a few messages (see above) from Gail that indicated i could try 1 thing, after making more room, or start over. Obviously the first task would be better on my damaged soul.

    I had previously freed up 5G of space, and then this morning i freed up some more, as much as I could, (it was about 12G in total) and told the DB to come online. As I watched the disk to see if it would report being consumed (it did not) I also watched the log by manically refreshing it until it finished restoring the DB. The DB came online, DBCC checkdb finished successfully and it picked up where it left off, chugging through its log file and replicating data elsewhere.

    I added another log file onto another lun that has more room (so i think now, similar to my thinking before with the lun where the log resided). So there are now 2 log files, one with the potential to outgrow its container, and the other less so.

    Replication is chugging along, data is moving, things seem to be ok at the moment.

    Review of our processes that allow such a massive amount of data movement need to occur. But that may need to wait for another day. Bandaids are in place now, and fingers are crossed that it will all continue to go well. I will continue to monitor it until it is done doing its stuff, settles down to normalcy and then my heart rate can go back down.

    Until then, fingers crossed.

  • sooooo many THANKS go out to Gail for looking at this problem and giving a course of action that has helped me out.

    You have gone above and beyond in your duties and I owe you so any thanks for your time, knowledge, effort and willingness to share them with me. It's a Festivus miracle!! If it would make it up to you, i owe you some massive thank you hugs next time we see each other.

  • I have just had this issue where the database decided it would take its self into a "unavailable" state, after freeing up enough space on the LOG drive the database automatically sorted its self out and brought its self back online.

    I followed it up with a checkDB, transaction log backup, shrink of the log file and a full backup.

    I had the issue in SQL 2008 Standard

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply