extremely slow log shipping restore in standby mode

  • Hi, how many VLF's in your transaction log?

    from master db, run

    DBCC LOGINFO (DBNAME)

  • Thanks for the response.

    There are 147 VLFs. I checked at the time and 147 didn't seem like too many. It's the same number on both the fast and slow server which makes sense I think.

  • was there any additional information written to the log after the bit you posted? I.e. something about 'log was restored.... first LSN: xxxx last LSN' ?

    If so, can you work out how many VLF's are being restored from the log backup: last LSN - first LSN?

  • 2011-04-26 19:13:13.18 Backup Log was restored. Database: QAI, creation date(time): 2008/02/12(11:37:56), first LSN: 24837:114963:1, last LSN: 24922:13456:1, number of dump devices: 1, device information: (FILE=1, TYPE=DISK: {'E:\SQLBACKUPS\QAI_logshipping\QAI_20110415214601.trn'}). This is an informational message. No user action is required.

    I think the calculation would be 24922-24837 = 85

  • 6 hours does look like an extremely long time to create the undo file, even if the entire 5.5 GB contains uncommitted transactions. Can you compare the restore speed against the other server? You can check in msdb..restorehistory for the time the restore started, then look in the SQL Server log for the time the restore completed.

    SQL BAK Explorer - read SQL Server backup file details without SQL Server.
    Supports backup files created with SQL Server 2005 up to SQL Server 2017.

  • Ray Mond (5/6/2011)


    6 hours does look like an extremely long time to create the undo file, even if the entire 5.5 GB contains uncommitted transactions.

    I agree it does seem an extremely long time, but... my experience of restoring with standby has been similar, and I resorted to using standby only for the last log file of a bunch.

    If the whole 5.5 GB contains uncommitted transactions, some of which were left over from the previous log file, then SQL Server has to re-apply the partial transactions from the previous undo file, then restore the log and apply the 5.5 GB of transactions, then roll back those same 5.5 GB worth of transactions and write them to a new undo file.

    As we've seen from some other threads on this site, rolling back a large transaction takes an order of magnitude longer than making the changes in the first place.

  • I've got the contents of the restorehistory records for the .trn file listed below...I tried to format it but I couldn't get a nice display to work in using this HTML format. To summarize, everything looks exactly the same between the two servers but the slow server starts at 9:56 and finishes at 19:13, the fast server starts at 18:47 and finishes at 19:48. I joined the restorehistory up with the log file based on the start and end LSN...the way the file name is offset due to the timezone caused a bit of confusion.

    From the fast server log file:

    2011-04-16 19:48:11.68 Backup Log was restored. Database: QAI, creation date(time): 2008/02/12(11:37:56), first LSN: 24837:114963:1, last LSN: 24922:13456:1, number of dump devices: 1, device information: (FILE=1, TYPE=DISK: {'f:\logshipping\qai\QAI_20110415214601.trn'}). This is an informational message. No user action is required.

    From msdb..restorehistory join backupset

    Slow sever:

    restore_history_id1909401

    restore_date4/26/11 9:56

    destination_database_nameQAI

    backup_set_id1909780

    restore_typeL

    replace0

    recovery0

    restart0

    stop_atNULL

    device_count1

    stop_at_mark_nameNULL

    stop_beforeNULL

    backup_set_id1909780

    backup_set_uuid7ECD61F5-27AA-4378-A09E-6801CBC0E6E8

    media_set_id1909778

    first_family_number1

    first_media_number1

    last_family_number1

    last_media_number1

    catalog_family_number1

    catalog_media_number1

    position1

    expiration_dateNULL

    software_vendor_id4608

    nameNULL

    descriptionNULL

    software_major_version9

    software_minor_version0

    software_build_version4035

    time_zone-16

    mtf_minor_version0

    first_lsn24837000011496300000

    last_lsn24922000001345600000

    checkpoint_lsn24903000010213400000

    database_backup_lsn19295000003680200000

    database_creation_date37:56.0

    backup_start_date46:01.0

    backup_finish_date52:08.0

    typeL

    sort_order52

    code_page0

    compatibility_level90

    database_version611

    backup_size5653515776

    database_nameQAI

    server_nameHLDSQL01

    machine_nameHLDSQL01

    flags512

    unicode_locale1033

    unicode_compare_style196609

    collation_nameSQL_Latin1_General_CP1_CI_AS

    is_password_protected0

    recovery_modelFULL

    has_bulk_logged_data0

    is_snapshot0

    is_readonly0

    is_single_user0

    has_backup_checksums0

    is_damaged0

    begins_log_chain0

    has_incomplete_metadata0

    is_force_offline0

    is_copy_only0

    first_recovery_fork_guidB12C1E27-CFEF-4A71-98F8-94C8DACC6994

    last_recovery_fork_guidB12C1E27-CFEF-4A71-98F8-94C8DACC6994

    fork_point_lsnNULL

    database_guidE10AB28E-3ADB-4055-AD59-F151742F8E76

    family_guidB12C1E27-CFEF-4A71-98F8-94C8DACC6994

    differential_base_lsnNULL

    differential_base_guidNULL

    Fast Server:

    restore_history_id722936

    restore_date4/16/11 18:47

    destination_database_nameQAI

    backup_set_id722917

    restore_typeL

    replace0

    recovery0

    restart0

    stop_atNULL

    device_count1

    stop_at_mark_nameNULL

    stop_beforeNULL

    backup_set_id722917

    backup_set_uuid7ECD61F5-27AA-4378-A09E-6801CBC0E6E8

    media_set_id722917

    first_family_number1

    first_media_number1

    last_family_number1

    last_media_number1

    catalog_family_number1

    catalog_media_number1

    position1

    expiration_dateNULL

    software_vendor_id4608

    nameNULL

    descriptionNULL

    software_major_version9

    software_minor_version0

    software_build_version4035

    time_zone-16

    mtf_minor_version0

    first_lsn24837000011496300000

    last_lsn24922000001345600000

    checkpoint_lsn24903000010213400000

    database_backup_lsn19295000003680200000

    database_creation_date37:56.0

    backup_start_date46:01.0

    backup_finish_date52:08.0

    typeL

    sort_order52

    code_page0

    compatibility_level90

    database_version611

    backup_size5653515776

    database_nameQAI

    server_nameHLDSQL01

    machine_nameHLDSQL01

    flags512

    unicode_locale1033

    unicode_compare_style196609

    collation_nameSQL_Latin1_General_CP1_CI_AS

    is_password_protected0

    recovery_modelFULL

    has_bulk_logged_data0

    is_snapshot0

    is_readonly0

    is_single_user0

    has_backup_checksums0

    is_damaged0

    begins_log_chain0

    has_incomplete_metadata0

    is_force_offline0

    is_copy_only0

    first_recovery_fork_guidB12C1E27-CFEF-4A71-98F8-94C8DACC6994

    last_recovery_fork_guidB12C1E27-CFEF-4A71-98F8-94C8DACC6994

    fork_point_lsnNULL

    database_guidE10AB28E-3ADB-4055-AD59-F151742F8E76

    family_guidB12C1E27-CFEF-4A71-98F8-94C8DACC6994

    differential_base_lsnNULL

    differential_base_guidNULL

  • That's certainly a big difference of 8 hours between the 2 servers to restore that transaction log. I know it's a long shot, but any chance the SQL Server process is running below normal priority?

    SQL BAK Explorer - read SQL Server backup file details without SQL Server.
    Supports backup files created with SQL Server 2005 up to SQL Server 2017.

  • It's running at normal priority...I did try setting it to high but it didn't make any difference.

  • Any chance it has something to do with the drive where the undo file is located, or perhaps you have some filter drivers that's intercepting the disk calls and performing some additional processing on the data/trx log/undo files e.g. the Hyperbac products?

    SQL BAK Explorer - read SQL Server backup file details without SQL Server.
    Supports backup files created with SQL Server 2005 up to SQL Server 2017.

  • Did anyone find any culprit to this?

    We have similar situation on DR, one day log restore for all 40 DBs in STANDBY mode went up by about 15 times per backup file.

    Nothing has changed on the box.

    I checked on of the small DBs - has only 12 VLFs and no trans in some backups, takes "same exact" time as a restore of significantly larger tran logs.

    It is a powerful DR box with ~1Tb of RAM allocated to SQL with nothing else running on the machine.

    Any ideas?

    - Anthony

    -SQL 2K8 R2 10.50.4290.0 cluster, with 1Tb RAM, 64 cores.

  • No, we never did find a solution other than switching to "no recovery mode" and then back to standby once it was caught up or just backing up the source database and reestablishing the log shipping from scratch again. Fortunately, both disk and network speed have improved for us so much over the intervening years, starting again from scratch is pretty quick and easy now.

  • I had this issue in the past.

    The culprit was a third party tool where log files that were not purged.

    Redgate's SQL Backup didn't set a purge by default on the logs...so I had a server that had 100,000 log files.

    Once I deleted the old log files and set the purge on the Server Options (SQL Backup Log Files section) in SQL Backup, the speed returned.

    The servers I had involved the resources were very low and I couldn't figure out why it was taking so long.

    Unsure if this is your issue, just throwing that out there.

  • Makes sense. I think the log shipping restore job inspects each file to see whether it has LSNs to be restored...scanning all those files would take a long time I imagine. That wasn't our issue though. Thanks.

  • Unfortunately not the case here.

    LOG restore is slow for a single given log backup.

    What makes it even more bizarre is that after SQL services restart restore for that same given file returns to "fast" for a while and then goes back up.

    Need to do more thorough testing to make sure I'm not missing anything.

    Regards, Anthony.

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply