Slow Transaction Log Restores

  • Hi!
    I'm investigating into some slow transaction log restores - mostly they are from the larger transaction log files created by index maintenance or consistency check tasks. Some of the files are a few gig and take 50+ minutes to restore (just the restore part only).

    I have looked at buffercount & maxtransersize but none of these options seem to help the restore time of the transaction log file. I have read up about Instant File Initialization - and I was wondering would that help transaction log backup restore times? I understand it doesnt affect the log file - but would it help replay into the DB itself on a standy/readonly database?

    Any other suggestions? I've also looked into VLDFs as well.

    Thanks!

  • How frequently are you taking the log backups, and can you improve the frequency so that you have more, (usually) smaller files?

  • You can turn on a trace flag for a verbose logging for restores and that can often tell you what process is taking the most time. To get verbose output to the log, run this prior to the restore:
    DBCC TRACEON(3004, 3605, -1)

    Sue

  • Hi!
    I'm backing up every 10 minutes - however actions like index maintenance, etc take a while - for instance between 12am and 2am we are about 30-40 minutes behind due to the fact that there is a 183MB log file at 12:10am and two more (183mb @ 1:40 and 195mb @ 1:50) - normally the log files are anywhere from 20mb to 50mb & take anywhere from 1-5 minutes to restore.

    I enabled the trace flags - this is for one log file restored (changed dbname and path)

    Date,Source,Severity,Message
    06/28/2017 10:37:46,spid66,Unknown,Setting database option MULTI_USER to ON for database InsertDBNameHere.
    06/28/2017 10:37:46,spid66,Unknown,RestoreLog: Finished
    06/28/2017 10:37:46,spid66,Unknown,Restore: Done with MSDB maintenance
    06/28/2017 10:37:46,spid66,Unknown,Writing backup history records
    06/28/2017 10:37:46,Backup,Unknown,Log was restored. Database: InsertDBNameHere<c/> creation date(time): 2013/01/09(09:39:15)<c/> first LSN: 15277:2727113:1<c/> last LSN: 15277:2939295:1<c/> number of dump devices: 1<c/> device information: (FILE=1<c/> TYPE=DISK: {'\\HWDRMONSQL01\SQLServerLogShipping_OM\InsertDBNameHere\InsertDBNameHere_backup_20170628103000.trn'}). This is an informational message. No user action is required.
    06/28/2017 10:37:46,spid66,Unknown,Restore: Writing history records
    06/28/2017 10:37:46,spid66,Unknown,Resuming any halted fulltext crawls
    06/28/2017 10:37:46,spid66,Unknown,Database is in STANDBY
    06/28/2017 10:37:46,spid66,Unknown,CHECKDB for database 'InsertDBNameHere' finished without errors on 2017-06-27 23:30:08.893 (local time). This is an informational message only; no user action is required.
    06/28/2017 10:37:46,spid66,Unknown,Starting up database 'InsertDBNameHere'.
    06/28/2017 10:37:46,spid66,Unknown,Recovery completed for database InsertDBNameHere (database ID 7) in 1 second(s) (analysis 91 ms<c/> redo 20 ms<c/> undo 563 ms.) This is an informational message only. No user action is required.
    06/28/2017 10:37:45,spid66,Unknown,Recovery is writing a checkpoint in database 'InsertDBNameHere' (7). This is an informational message only. No user action is required.
    06/28/2017 10:37:44,spid66,Unknown,Zeroing completed on F:\SQLLogs\InsertDBNameHere_log.ldf
    06/28/2017 10:37:44,spid66,Unknown,Zeroing F:\SQLLogs\InsertDBNameHere_log.ldf from page 2583627 to 2584107 (0x4ed896000 to 0x4edc56000)
    06/28/2017 10:37:44,spid66,Unknown,FixupLogTail(progress) zeroing F:\SQLLogs\InsertDBNameHere_log.ldf from 0x4ed895e00 to 0x4ed896000.
    06/28/2017 10:37:44,spid66,Unknown,The database 'InsertDBNameHere' is marked RESTORING and is in a state that does not allow recovery to be run.
    06/28/2017 10:37:44,spid66,Unknown,Starting up database 'InsertDBNameHere'.
    06/28/2017 10:37:44,spid66,Unknown,Transitioning to STANDBY
    06/28/2017 10:37:44,spid66,Unknown,Restore: Done with fixups
    06/28/2017 10:37:44,spid66,Unknown,Rollforward complete on database InsertDBNameHere
    06/28/2017 10:35:19,spid66,Unknown,Restore-Redo begins on database InsertDBNameHere
    06/28/2017 10:35:19,spid66,Unknown,Restore: Backup set restored
    06/28/2017 10:35:19,spid66,Unknown,Restore: Data transfer complete on InsertDBNameHere
    06/28/2017 10:35:19,spid66,Unknown,FileHandleCache: 0 files opened. CacheSize: 200
    06/28/2017 10:35:01,spid66,Unknown,Restore: LogZero complete
    06/28/2017 10:35:01,spid66,Unknown,Restore: Waiting for log zero on InsertDBNameHere
    06/28/2017 10:35:01,spid66,Unknown,Restore: Transferring data to InsertDBNameHere
    06/28/2017 10:35:01,spid66,Unknown,Restore: Restoring backup set
    06/28/2017 10:35:01,spid66,Unknown,Restore: Containers are ready
    06/28/2017 10:35:01,spid66,Unknown,Restore: PreparingContainers
    06/28/2017 10:35:01,spid66,Unknown,Restore: Finished undoing STANDBY for InsertDBNameHere
    06/28/2017 10:35:01,spid66,Unknown,Zeroing completed on F:\SQLLogs\InsertDBNameHere_log.ldf
    06/28/2017 10:35:01,spid66,Unknown,Zeroing F:\SQLLogs\InsertDBNameHere_log.ldf from page 2570366 to 2570377 (0x4e70fc000 to 0x4e7112000)
    06/28/2017 10:35:01,spid66,Unknown,FixupLogTail(progress) zeroing F:\SQLLogs\InsertDBNameHere_log.ldf from 0x4e70fb200 to 0x4e70fc000.
    06/28/2017 10:35:01,spid66,Unknown,SnipEndOfLog from LSN: (15277:2727113:1)
    06/28/2017 10:35:01,spid66,Unknown,Restore: Undoing STANDBY for InsertDBNameHere
    06/28/2017 10:35:01,spid66,Unknown,Restore: BeginRestore (offline) on InsertDBNameHere
    06/28/2017 10:35:01,spid66,Unknown,Restore: Planning complete
    06/28/2017 10:35:01,spid66,Unknown,Dismounting FullText catalogs
    06/28/2017 10:35:01,spid66,Unknown,Halting FullText crawls on database InsertDBNameHere
    06/28/2017 10:35:01,spid66,Unknown,Restore: Planning begins
    06/28/2017 10:35:01,spid66,Unknown,Restore: Backup set is open
    06/28/2017 10:35:01,spid66,Unknown,Restore: Configuration section loaded
    06/28/2017 10:35:01,spid66,Unknown,Opening backup set
    06/28/2017 10:35:01,spid66,Unknown,X-locking database: InsertDBNameHere
    06/28/2017 10:35:01,spid66,Unknown,RestoreLog: Database InsertDBNameHere
    06/28/2017 10:35:01,spid66,Unknown,Setting database option SINGLE_USER to ON for database InsertDBNameHere.

    The log file was 34MB in size. It looks like its the Restore-Redo stage thats taking the longest - any way of "speeding" that part up?

    Thanks!

  • It is the redo phase taking the most time but it isn't taking 50+ minutes. That's just under three minutes.
    If you wanted to speed up restores, I would look at the VLFs. I would also check if autogrowth is enabled and if it is, what is the growth increment. And it shouldn't be percent. 

    Sue

  • Thanks Sue.
    I know that its not a huge amount of time (I traced during the day which is when the log files are smaller), but it does get larger at night times (thats when it gets to 50+ minutes) - I'll need to keep the trace enabled during that time to capture whats going on then - this was just during a normal restore during the day with less activity.
    I've checked VLFs and it only has 16. MDF is 70GB, LDF is 30GB. Set to autogrow by 500MB increments.

    Any other ideas?

    Thanks!
    Melissa

  • This is for log shipping, correct?
    If so, did you see this post:
    https://www.sqlservercentral.com/Forums/1103972/extremely-slow-log-shipping-restore-in-standby-mode?PageIndex=2

    The "best" solution that I see from there was from dpaproskijr who said:
    we never did find a solution other than switching to "no recovery mode" and then back to standby once it was caught up or just backing up the source database and reestablishing the log shipping from scratch again. Fortunately, both disk and network speed have improved for us so much over the intervening years, starting again from scratch is pretty quick and easy now. 

    Also, if I understand what your log shows and what adb2303 says in the above post, that is all reading from and writing to a single VLF:
    If so, can you work out how many VLF's are being restored from the log backup: last LSN - first LSN? 

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • It wasn't mentioned but I saw that path reference to \SQLServerLogShipping_OM and the database being in standby as well.
    The switching around recovery is due to avoiding the likelihood of a large undo file. Some large, long running transaction in the log from the primary needing to be rolled back. If you read that post, that really looks like what they were working around - waited until it all catches up or rebuild log shipping. You'd really want to check the size of the undo file and also check what was going on with the primary when the log backup was taken, e.g. index maintenance, things like that. And where the undo file is written to can also cause a problem when it's large. But again, it's usually the size that will tip you off on that. And you will almost always have at least two slow restores in a row as the next one will need to read the undo file and undo to get it to the right point as if there wasn't a with standby done. Not sure if you can see more of that going on if you change the trace flag to 3014, if you see anything with the undo file...another thing to play with.
    But this one was a wimpy one. 

    Sue

  • Thanks for that Sue!
    By default we allow for STANDBY but in one of the cases I have, we dont need the database in STANDBY and NORECOVERY is enough. That has speed up the restore time by double in testing (during the day though), so I will see how it goes overnight.

    I think it will be a case of case-by-case tweaks and seeing what works in each situation. Thankyou all soo much for your input!

  • melc - Wednesday, June 28, 2017 6:52 PM

    Thanks for that Sue!
    By default we allow for STANDBY but in one of the cases I have, we dont need the database in STANDBY and NORECOVERY is enough. That has speed up the restore time by double in testing (during the day though), so I will see how it goes overnight.

    I think it will be a case of case-by-case tweaks and seeing what works in each situation. Thankyou all soo much for your input!

    You probably should look at the undo file - size and location. The undo file is likely the issue based on what you have seen so far.

    Sue

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply