Reducing Log Shipping Restore Latency

  • Hi,

    We use log shipping in our 2008 R2 databases. On one of our large heavily used databases, transaction log backups are run every two minutes. Each TL backup is around 50MB in size when the primary server is used heavily. We use TDE on the database and backup compression.

    The files are transferred quickly, but the restore job takes a long time during peak times. For example when heavy load is placed on the server, I've noticed the restore job to fall behind by nearly 24 hours. The transaction logs are safely copied to the remote server, so in the event of primary failure, we would be able to restore back, up to the last copied file, but the restore time objective could potentially be 24 hours whilst we wait for the restores to complete.

    Can anyone help with why the restores are taking so long? What factors do I need to consider? The DR server is slightly different spec to production so I am expecting it to be hardware related, but would be good to get other ideas.

  • Is the database in restore mode or Standby\readonly?

    -----------------------------------------------------------------------------------------------------------

    "Ya can't make an omelette without breaking just a few eggs" 😉

  • Log shipping for the database was configured in Standby/readonly mode, but the DB is in restore mode when it is restoring the transaction logs. When it falls behind, because it is restoring all the files in turn, it stays in restoring mode for the duration, until the load reduces and it catches up with it's self. Then it goes back to standby between the normal restore job schedule.

  • As a side note. Logs are taking about 20 minutes to restore for a 50 MB file. The number of changes is around 4800 inserts during the two minute period between log backups.

    Looking at the failover server, CPU is minimal, but SQL Server has hit its MAX memory setting and this seems to be the bottleneck, although not sure why.

  • Restoring with standby can place a LARGE overhead on the restore of each transaction log backup file.

    Leaving the database in standby means that SQL Server has to leave the database transactionally consistent (i.e. no half done transactions). This means undoing any changes made by uncommitted transactions, and writing them away so it can redo them immediately prior to restoring the next transaction log backup.

    As you are backing up the logs every 2 minutes, you will have a bigger chance of having uncommitted transactions spanning log backups, therefore causing a lot of redo work before restoring the next log backup and a lot of undo work after restoring.

    Are you actually using the database for any read-only work? I'm guessing not, given the frequency of backups, as users would have to be kicked off before each backup is restored.

  • Do you absolutely require the secondary database to be restored that regularly, have you thought of deferring the copy and\or the restore jobs to run at low usage periods. The rate you are at the moment i'd be surprised if anyone were able to query the database as they'll constantly be disconnected.

    Please post the results of the following query against the secondary server

    exec sys.sp_help_log_shipping_secondary_database 'yoursecondarydb'

    -----------------------------------------------------------------------------------------------------------

    "Ya can't make an omelette without breaking just a few eggs" 😉

  • Ah! That makes perfect sense when you think about the redo overhead! I am not sure why we have the DB in standby mode. We do not report on it and you're right, with a restore period so small, it is pointless.

    I have changed the database to be in restore mode and would you believe it, each backup file is taking about 40 seconds to restore now! Unbelievable!! Talk about an overhead. From 10 to 20 minutes a file down to less than a minute!

    Thanks all for your help. Definitely learnt something today!

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply