SQL restore takes a long time - how do I clear committed transactions from the log?

  • Hi everyone,

    There are so many blog posts and articles out there about this topic, and I have been reading them, but I am still confused. I restored a database and it took more than 12 hours to complete. I noticed that the transaction log was more than 500GB, so I am guessing the redo/undo phase was very long. The database is in the full recovery model.

    If the transaction log is 500GB but it has 490GB of free space, does that mean I have only 10GB of committed/uncommitted transactions?
    If the transaction log is 500GB and it has no free space, does that mean I have a lot of committed/uncommitted transactions in the log file? If yes, should I just take a backup of the transaction log to clear them, and then take a full backup?
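
    For reference, this is how the log size and free space can be checked (a minimal sketch; sys.dm_db_log_space_usage is available from SQL Server 2012):

    -- Size and percent used of every database's log
    DBCC SQLPERF(LOGSPACE);

    -- Or for the current database only
    SELECT total_log_size_in_bytes / 1048576.0 AS log_size_mb,
           used_log_space_in_bytes / 1048576.0 AS used_log_mb,
           used_log_space_in_percent
    FROM sys.dm_db_log_space_usage;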

    Thank you

  • Gamleur84 - Wednesday, December 6, 2017 12:32 PM

    In essence, yes, that is correct. Regular log backups should be taken to ensure you can meet your company's RTO and RPO, as well as to keep log sizes manageable.

    If you have a 500GB log and it's full, back up the log first, then do a full backup.
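
    Something along these lines; the database name and backup paths here are placeholders, so adjust for your environment:

    BACKUP LOG [YourDatabase]
        TO DISK = N'X:\Backups\YourDatabase_log.trn';   -- clears the inactive portion of the log
    BACKUP DATABASE [YourDatabase]
        TO DISK = N'X:\Backups\YourDatabase_full.bak'
        WITH COMPRESSION;                               -- new recovery base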

    If you have restored this to a new server, the restore command has to create the database structure exactly as it is in the full backup, so it will still have to create a 500GB log file; even if 490GB of it is empty, it still creates the file at 500GB, not 10GB. That's another reason to do regular log backups: to stop the file size from getting too far out of control.

    That being said, you may want to profile the wait stats while restoring; you're probably getting a lot of IO waits. I would be inclined to check whether you have "Instant File Initialization" enabled, then try the restore again with a completely new database to see if that helps reduce the 12-hour restore time.
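
    On SQL Server 2016 SP1 and later you can check whether IFI is enabled for the service account straight from a DMV (on older versions, check whether the service account holds the "Perform volume maintenance tasks" privilege instead):

    -- instant_file_initialization_enabled was added in SQL Server 2016 SP1
    SELECT servicename, instant_file_initialization_enabled
    FROM sys.dm_server_services;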

  • anthony.green - Thursday, December 7, 2017 1:13 AM

    If I recall correctly, IFI doesn't help with the transaction log; SQL is still going to create the file, then zero out every sector of it (whereas IFI does help with the data files, which SQL does *NOT* zero out).
    So a half-terabyte file is going to take a long time to zero out, even on fast storage...

    So it might be worthwhile to find out *why* the log is 500GB, then look into what can be done to get it, and keep it, smaller (if you have to do restores on a regular basis). That might mean increasing the frequency of log backups, finding transactions (data loads, etc.) that bloat the log and tuning those, or other changes.
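
    As a starting point, sys.databases shows what, if anything, is currently preventing log truncation:

    -- Typical values: LOG_BACKUP, ACTIVE_TRANSACTION, AVAILABILITY_REPLICA, NOTHING
    SELECT name, log_reuse_wait_desc
    FROM sys.databases
    WHERE name = N'YourDatabase';  -- placeholder name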

  • jasona.work - Thursday, December 7, 2017 6:04 AM

    Ah yes, I always forget that part, and the documentation is a little hazy around the log file: the top of the doc says it works for logs, but then you read further and it says it doesn't, as logs are always zeroed out when created or grown. How I love the consistency of Microsoft Docs.

  • In addition to taking a log backup and looking into why the log is that large, I would also want to look at the number of VLFs, as that can impact the recovery time.
    If you want to be certain about which phase of the restore is taking the most time, you can set trace flags for verbose logging prior to doing the restore:
    DBCC TRACEON(3004, 3605, -1)
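
    A sketch of both checks; sys.dm_db_log_info needs SQL Server 2016 SP2 or later (on older builds, DBCC LOGINFO returns the same per-VLF rows):

    -- One row per VLF, so COUNT(*) is the VLF count
    SELECT COUNT(*) AS vlf_count
    FROM sys.dm_db_log_info(DB_ID(N'YourDatabase'));  -- placeholder name

    -- Verbose restore logging; the extra detail is written to the error log
    DBCC TRACEON(3004, 3605, -1);
    -- ... run the RESTORE here ...
    EXEC master.dbo.xp_readerrorlog;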

    Sue

  • Hi guys, thank you for the answers.

    I am not concerned about the 500GB empty file. I am concerned about the amount of committed/uncommitted transactions in the log file. Let me give some more detailed information on this case, because I think it is very interesting.

    The log file is 500GB and it is 99% empty, but here is the interesting part: I have 600 VLFs in the transaction log file, and all of them have status 2 (active). I thought I was going crazy: how can my log file be empty yet have all of its VLFs marked as active? I found the reason here: https://blogs.msdn.microsoft.com/sql_pfe_blog/2013/06/27/lazy-log-truncation-clearing-of-sql-transaction-log-vlf-status-deferred/.

    In short, my transaction log file is empty (truncated), but all of its VLFs are active.
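
    For anyone who wants to check their own log, the statuses can be inspected with DBCC LOGINFO (the Status column is 2 for active VLFs), or on SQL Server 2016 SP2+ with something like:

    -- vlf_status: 0 = inactive, 2 = active
    SELECT vlf_status, COUNT(*) AS vlf_count
    FROM sys.dm_db_log_info(DB_ID())
    GROUP BY vlf_status;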

    How does it affect the restore? Will the REDO/UNDO phases go through all the VLFs because they are marked as active?

  • Having too many VLFs primarily affects the discovery portion of the recovery process, as it reads every VLF one at a time, whether or not there are active transactions to be processed. It reads every VLF one by one, in single-threaded fashion, no matter what the status is.

    Sue

  • Keep the log large enough to support the transactions between log backups (when in full recovery), but small enough to facilitate a sensible restore time.
    IFI will initialise data files quickly, but the log is always zeroed.
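
    When the log has ballooned, one common way to bring it back to a sensible size is to shrink it and grow it back in one controlled step; a sketch, with placeholder database and logical file names:

    USE YourDatabase;
    -- Take a log backup first so the log is mostly empty, then
    -- shrink the log file to roughly 1GB (the target is in MB)
    DBCC SHRINKFILE (YourDatabase_log, 1024);
    -- Grow it back to the size the workload actually needs
    ALTER DATABASE YourDatabase
        MODIFY FILE (NAME = YourDatabase_log, SIZE = 8GB);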

    -----------------------------------------------------------------------------------------------------------

    "Ya can't make an omelette without breaking just a few eggs" 😉

  • Gamleur84 - Thursday, December 7, 2017 11:50 AM

    I am not concerned about the 500GB empty file.

    You should be. Even with a log file in good shape insofar as VLFs go, a file that size takes a long time to format, even with "Instant File Initialization" enabled. If it turns out that you actually need a 500GB log file, then you may have a serious problem with some code, among a couple of other things.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Was this database part of an availability group? There's a thing about lazy recycling of transaction log VLFs that might be what's causing the problems/concerns: https://blogs.msdn.microsoft.com/sql_pfe_blog/2013/06/27/lazy-log-truncation-clearing-of-sql-transaction-log-vlf-status-deferred/
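
    If you're not sure, something like this lists the databases that belong to an availability group on the local instance (a sketch):

    SELECT d.name AS database_name, ag.name AS ag_name
    FROM sys.dm_hadr_database_replica_states AS rs
    JOIN sys.databases AS d ON d.database_id = rs.database_id
    JOIN sys.availability_groups AS ag ON ag.group_id = rs.group_id
    WHERE rs.is_local = 1;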

    Thomas Rushton
    blog: https://thelonedba.wordpress.com

  • Hey Thomas,

    Yes, it is! That is exactly my problem. Do you know how it affects the backup/restore process?

  • Jeff Moden - Wednesday, December 13, 2017 11:04 AM

    I think you need to look at how large and busy that database is before saying a 500GB transaction log is too large. If you have a 10TB system where several tables are larger than 500GB, then a 500GB transaction log wouldn't even be large enough for the index rebuild operations. Even if the database is smaller, if you have one or more tables of significant size, you could require that much transaction log space for normal operations.

    The total number of VLFs, if the log was built out prior to SQL Server 2014 and grown in 8GB sections, would actually be quite a bit more than 600. If grown out on 2014 or later, the algorithm has changed and you could have far fewer VLFs. I wonder what a good recommendation will be for systems on 2014 and later: should we reset auto-growth and grow out transaction logs at 512MB to 1GB, or keep the recommended 8GB sizes?
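
    Either way, the growth increment is set per file; a sketch with placeholder names (pick the increment that suits your VLF goals):

    ALTER DATABASE YourDatabase
        MODIFY FILE (NAME = YourDatabase_log, FILEGROWTH = 8GB);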

    As for the OP's issue, I wonder whether the AOAG is set up as synchronous or asynchronous, and how far behind the secondaries were at the time of the backup. If the redo queue on the secondaries was backed up significantly and the secondary is synchronous, then I could see a very long restore operation, because SQL Server would have to roll back all uncommitted and unhardened transactions.
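
    The send and redo queues are visible per database on the primary; a sketch:

    SELECT DB_NAME(database_id) AS database_name,
           log_send_queue_size,  -- KB not yet sent to the secondary
           redo_queue_size       -- KB received but not yet redone
    FROM sys.dm_hadr_database_replica_states;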

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • Jeffrey Williams 3188 - Thursday, December 14, 2017 11:43 AM

    I currently work with a couple of 1TB tables and several databases that are approaching or have exceeded the 1TB mark; it would be stupid for me to make such comments if I hadn't. I'll stick with the 500GB log file being too large, because rebuilding even a clustered index on a 1TB table will not require that much log file IF you do things "correctly" ("differently" than most would consider to be "correct") considering the size of what you're dealing with. Rebuilding a CI on a 1TB monolithic table will also blow out the MDF file with 1TB of unused space unless you're careful there as well. If you have such a large table and are relegated to using the Standard Edition, consider a redesign (or purchase the "expensive" edition) so that you don't have such problems. 😉

    Of course, if one designs the table and clustered index correctly and then uses it correctly, you should never need to rebuild it. 😉

    As a bit of a sidebar, it would be nice if Microsoft would get rid of both REBUILD and REORGANIZE and combine the best of both methods in a MAINTAIN option. It would also be handy if they came to the realization that logical fragmentation matters little, and then only in specific cases, given the way SANs work today. Page density is much more important than logical fragmentation. It would also help people a whole lot if they understood that logical fragmentation has absolutely no effect on singleton lookups.

    --Jeff Moden
