Why transaction log cannot be trucnated during full backup

  • Hello,

    I am trynig to find out what are technical reasons or is it only design chosen by Microsoft, that transaction log cannot be truncated during full backup. I mean not full backup truncating log, but if during execution of full backup, additionally transaction log is executed, it does not truncate log. This sometime causes filling up of disk in transaction log drive. Transaction log is being backed up independently from full backup - full backup only includes part of transaction log needed to ensure consistency.

    So only reason why it is not truncated seems to be that if we would like to truncate log - full backup would need to separatelly log database transactions during while it is running and this would require even more space for log (possibly not double the size of transaction log as it needs only part) and it would be not optimal.

    Hope someone has some technical knowlege or insights on this topic.

    1. Full Database Backups (SQL Server): "A full database backup backs up the whole database. This includes part of the transaction log so that the full database can be recovered after a full database backup is restored. Full database backups represent the database at the time the backup finished."
    2. What recovery model are you using? What is your backup schedule?
    3. You need to solve the root problem rather than just reactively truncating transaction log. It sounds like you need

      1. Bigger drive for log file(s); and/or
      2. More frequent log file backups (full recovery) or full backups if not; and/or
      3. Tune transactions so that they don't generate such large transactions (eg., batching)

    Obviously, there may be rare emergencies when we may have to truncate/shrink log files. But don't try to do it during full backup.

  • I am using full backup, it is done weekly, diff daily and transaction log backups are done every two hours. It is ok for most of servers, but on some there there are large databases - few terabytes in size, full backup might run for 10, 15 or more hours. Then for that amount of time SQL server does not allow transaction log to be truncated. About suggestions:

    1. Bigger drive would solve issue, but it is sometime difficult to predict what is right sizefor drive in worst case (full backup runs and some activity generates lots of log) on each server and we have lots of them in my company
    2. More frequent log backup is irrelavant here as full backup prevents truncation of log while it is running, even if transaction log backup is done at same time
    3. Same as 2 point - if full backup runs for many hours it will still generate lots of log, even if transactions are smaller

    But basically in this question I want to understand technically why SQL server is designed in this way, rather than to find fix for some operational issue. As for example as I know - might be wrong, that in oracle server you can do full backup without including log backup. So log can be backed up and truncated while full backup is running, but of course full backup would be inconsisten without doing log backup after full backup is done.

    In SQL there is only 1 option - to make full backup with log backup included. So my understanding is that to truncate log, during full backup there would be 2 technical options;

    1. Like in oracle, to have flexibility not to include log in full backup (leaving full backup inconsisten)
    2. Implementing some separate transaction logging during full backup, which would require some more space, but would allow to truncate transaction log

    But none of these options are implemented in SQL server for some reason and I am trying to understand why.

     

  • To clarify, we are talking about running out of file space for the log files (ldf) and not the backups, correct? If so, then more frequent log backups could help -- your log file grows if it has no space for transactions that have not yet been backed up -- more frequent backups => fewer transactions in the queue. Log backups can occur while full backups are running.

    Are we talking about running out of space only rarely (once a month or less)? Or frequently/continually?

    Have you checked for open transactions w/ DBCC OPENTRAN? Open transactions will prevent not just shrinking of the log file, but reuse of space.

    The problem may also be the size/duration of transactions. Do you have continual large transactions? What is the duration of those large transactions? Do you have applications that (perhaps unnecessarily) keep transactions open for a long time?

    Do you have a relatively large amount of data that is historical or static? It might be that you could take advantage of filegroup backups and only back up contemporary data.

    Are backups compressed?

    Are you using multi-file full backups? Parallelizing the backups could reduce backup time significantly... if you have the bandwidth... which leads to the question: How is I/O and drive speed? Have you benchmarked those to determine if they are adequate or if there are network or hardware problems? 10-15 hours seems extremely slow for  a few terabytes. Are you backing up to a SAN?

    I'll leave the whys to someone else.

     

  • SQL Server cannot truncate the log until after the backup completes - because that is the last phase of the backup process.  Once the database has been backed up, SQL Server can then identify what needs to be included from the transaction log to ensure that recovery will be able to roll forward - or roll back transactions that were in process at the completion of the full backup.

    The core problem you have is that your backups are taking 10+ hours to complete.  That is the problem you need to resolve - and there are many possibilities to improve that performance.  For example, if you are backing up across the network - then you need to look at creating a dedicated backup network and make that network as fast as possible.  If you are not setting BUFFERCOUNT and MAXTRANSFERSIZE on your backups - then you need to test and evaluate what settings would improve the performance.

    If nothing else can be done to improve the performance, then you probably should look at striping the backups - and if that still doesn't get you to a reasonable timeframe, then a SAN based solution may be the better option.  A good example is IBM's SPP product - which utilizes a snapshot technology to snap the backup to a separate iSCSI volume, compresses and dedupes the snap - and then only backs up the delta (they call it full backup - incremental forever).

    FWIW - I backup a 6TB+ database every night in less than 2 hours.  That process backs up to a local drive - then copies the file to another server.

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • Answering to ratbak - just to clarify - yes transaction log backups can be done when full backup is running, but log file is not truncated. If I query sys.database in log_reuse_wait_desc there is value of ACTIVE_BACKUP_OR_RESTORE. Here is also explanation about this - https://sqlity.net/en/1835/log-reuse-waits-explained-active_backup_or_restore/

    About space - I am talking about log file space, we have separate log drive which gets filled.

    Backups are compressed, they are also throttled to some level as we have shared infrastructure with few thousand SQL servers and even more databases. Therefore speed of backups cannot be increased. Also archiving of data would be an option, but it cannot be implemented in all cases.

    Thanks for your replies with suggestions!

  • @Jeffrey - we have shared infrastructure with thousands of sql servers, backups have dedicated network, but backups are throttled not to overutilize network and storage as backups run on same time on most of servers. I will check with backup team if tuning BUFFERCOUNT and MAXTRANSFERSIZE would help, we will try some tests. Of course also we might consider unthrottling backups for these huge databases, while keeping throttle on smaller ones, where transaction log disks does not get filled during backup.

    "FWIW - I backup a 6TB+ database every night in less than 2 hours." - wow this us superfast - we have speed of backup/restore around 200-400 Gb/hour therefore for multi TB databases it takes 10-15 hours or more to backup. On some it is not an issue to have long running backup as there is no high activity, but some are active and generate a lot of longs.

    Thanks for idea of IBM's SPP.

  • Jeffrey Williams wrote:

    FWIW - I backup a 6TB+ database every night in less than 2 hours.  That process backs up to a local drive - then copies the file to another server.

    I'm curious... How long does it take to do the actual backup and how long does it take to do the copying?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden wrote:

    Jeffrey Williams wrote:

    FWIW - I backup a 6TB+ database every night in less than 2 hours.  That process backs up to a local drive - then copies the file to another server.

    I'm curious... How long does it take to do the actual backup and how long does it take to do the copying?

    I haven't reviewed the copy process recently - it is a separate process we job off once that database has completed its backup and the agent job is deleted once it completes successfully.  So the actual backup time is about 1 hour 21 minutes for that one database.  It can take a couple of hours to move the file to the other system.

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • Guys, thanks for you answers and considerations - but we have moved to looking for a practical solution on how not to fill disks. But what I am really looking for is just theoretic answer on why in SQL server transaction log cannot be truncated when backed up while full backup is running. As i mentioned in previous posts - in oracle full is done without including log backups, so it can be truncated. But it is not the case in SQL as full includes transaction log. But also I thought - why inactive VLF could not be truncated during full backup, if it is already backed up? Maybe it could be theoretically implemented? SQL would just need to track which inactive VLFs are already backed up.

  • Tadas wrote:

    @Jeffrey - we have shared infrastructure with thousands of sql servers, backups have dedicated network, but backups are throttled not to overutilize network and storage as backups run on same time on most of servers.

    If your backups are on a "dedicated network", I don't understand the need to throttle the backups to not "overutilize" the network.  If you made the backups run faster, then you might even have time to stagger the backups to prevent overustilizing the storage.  If the storage is on high performance SSDs, there's not actually much to worry about there so long as you have enough "connections".

    Tadas wrote:

    I will check with backup team if tuning BUFFERCOUNT and MAXTRANSFERSIZE would help, we will try some tests. Of course also we might consider unthrottling backups for these huge databases, while keeping throttle on smaller ones, where transaction log disks does not get filled during backup.

    Have you done that yet?  Along with a good backup network and storage, Jeff Williams an I have both been able to attain rates of about 3TB per hour only by changing adding those two settings with appropriate values.

    As for the issue of log file backups not truncating the log file during backups, is it that or is it long winded transactions that would stuff the transaction logs regardless of what else is running but just happen to be running while you're doing the FULL backups.  I don't know for sure because I've not had the issue (knock on wood) but do YOU know for sure?  To be sure, I'm NOT doubting your integrity in any way... I'm just trying to make sure that you consider a possible alternative issue.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply