Log shipping secondary database occasionally corrupting

  • Log shipping has been configured between two of our SQL Server instances in different data centres, and has been working fine for many months. However, occasionally the LS_Alert job reports that one of the secondary databases is out of sync. On closer inspection, the secondary database in question is no longer in "Standby/Read-Only" mode but is instead "Restoring...". From this point on, every time the log shipping restore job attempts to restore the latest transaction logs to this secondary database, the following error occurs:

    *** Error: An error occurred while processing the log for database '********'. If possible, restore from backup. If a backup is not available, it might be necessary to rebuild the log.
    An error occurred during recovery, preventing the database '********' (12:0) from restarting. Diagnose the recovery errors and fix them, or restore from a known good backup. If errors are not corrected or expected, contact Technical Support.
    RESTORE LOG is terminating abnormally.
    Processed 0 pages for database '********', file '********' on file 1.
    Processed 2 pages for database '********', file '********_Log' on file 1.(.Net SqlClient Data Provider) ***

    On multiple occasions now we have had to resort to restoring the latest primary database backup over the secondary database (in standby mode) to resolve this issue. However, the issue has occurred three times in the past two weeks alone, and has occurred many times in different environments and databases in our experience.

    This issue is not limited to SQL Server 2016; we have experienced this issue with log shipping to secondary databases in SQL Server 2012 and 2014 too.

    Has anyone else experienced this? Does anyone know the cause? We previously posted a similar thread at the link below, but never got to the bottom as to why this is happening.

    https://www.sqlservercentral.com/Forums/Topic1727457-3411-1.aspx

  • I would assume log backup file corruption. It could be either during the backup in the source or after copying to the destination.
    >>Check for the log file size (sometimes it can 0KB which is corrupt)
    >>Try RESTORE VERIFYONLY on the logfiles and see if it reports any corruption. Specially the last ones restored before the error occurred.
    >>If possible try manual restore

  • zoggling - Wednesday, May 16, 2018 8:09 AM

    Log shipping has been configured between two of our SQL Server instances in different data centres, and has been working fine for many months. However, occasionally the LS_Alert job reports that one of the secondary databases is out of sync. On closer inspection, the secondary database in question is no longer in "Standby/Read-Only" mode but is instead "Restoring...". From this point on, every time the log shipping restore job attempts to restore the latest transaction logs to this secondary database, the following error occurs:

    *** Error: An error occurred while processing the log for database '********'. If possible, restore from backup. If a backup is not available, it might be necessary to rebuild the log.
    An error occurred during recovery, preventing the database '********' (12:0) from restarting. Diagnose the recovery errors and fix them, or restore from a known good backup. If errors are not corrected or expected, contact Technical Support.
    RESTORE LOG is terminating abnormally.
    Processed 0 pages for database '********', file '********' on file 1.
    Processed 2 pages for database '********', file '********_Log' on file 1.(.Net SqlClient Data Provider) ***

    On multiple occasions now we have had to resort to restoring the latest primary database backup over the secondary database (in standby mode) to resolve this issue. However, the issue has occurred three times in the past two weeks alone, and has occurred many times in different environments and databases in our experience.

    This issue is not limited to SQL Server 2016; we have experienced this issue with log shipping to secondary databases in SQL Server 2012 and 2014 too.

    Has anyone else experienced this? Does anyone know the cause? We previously posted a similar thread at the link below, but never got to the bottom as to why this is happening.

    https://www.sqlservercentral.com/Forums/Topic1727457-3411-1.aspx

    If it's happening in different environments and different databases I would suspect something being not so good with your storage or some other piece in your I/O subsystem. Look at what is common in all of those environments - using same SAN? Does server has the same NIC and same drivers? Is antivirus software excluding the data/log directories for all servers? Things along those lines.

    Sue

  • harikumar.mindi - Wednesday, May 16, 2018 9:17 AM

    I would assume log backup file corruption. It could be either during the backup in the source or after copying to the destination.
    >>Check for the log file size (sometimes it can 0KB which is corrupt)
    >>Try RESTORE VERIFYONLY on the logfiles and see if it reports any corruption. Specially the last ones restored before the error occurred.
    >>If possible try manual restore

    Thanks harikumar.mindi, unfortunately corruption of the log backup does not appear to be the issue, since once the previous backup had been restored over the top of the standby database, all transaction logs (including the problematic one) restored without any issue.

  • Sue_H - Wednesday, May 16, 2018 10:13 AM

    zoggling - Wednesday, May 16, 2018 8:09 AM

    Log shipping has been configured between two of our SQL Server instances in different data centres, and has been working fine for many months. However, occasionally the LS_Alert job reports that one of the secondary databases is out of sync. On closer inspection, the secondary database in question is no longer in "Standby/Read-Only" mode but is instead "Restoring...". From this point on, every time the log shipping restore job attempts to restore the latest transaction logs to this secondary database, the following error occurs:

    *** Error: An error occurred while processing the log for database '********'. If possible, restore from backup. If a backup is not available, it might be necessary to rebuild the log.
    An error occurred during recovery, preventing the database '********' (12:0) from restarting. Diagnose the recovery errors and fix them, or restore from a known good backup. If errors are not corrected or expected, contact Technical Support.
    RESTORE LOG is terminating abnormally.
    Processed 0 pages for database '********', file '********' on file 1.
    Processed 2 pages for database '********', file '********_Log' on file 1.(.Net SqlClient Data Provider) ***

    On multiple occasions now we have had to resort to restoring the latest primary database backup over the secondary database (in standby mode) to resolve this issue. However, the issue has occurred three times in the past two weeks alone, and has occurred many times in different environments and databases in our experience.

    This issue is not limited to SQL Server 2016; we have experienced this issue with log shipping to secondary databases in SQL Server 2012 and 2014 too.

    Has anyone else experienced this? Does anyone know the cause? We previously posted a similar thread at the link below, but never got to the bottom as to why this is happening.

    https://www.sqlservercentral.com/Forums/Topic1727457-3411-1.aspx

    If it's happening in different environments and different databases I would suspect something being not so good with your storage or some other piece in your I/O subsystem. Look at what is common in all of those environments - using same SAN? Does server has the same NIC and same drivers? Is antivirus software excluding the data/log directories for all servers? Things along those lines.

    Sue

    Network/storage was a suspected cause in the previous forum thread too. It is a possibility, how would one go about proving this? The problem occurs infrequently, and without warning. Other logs (such as Event Viewer) do not appear to highlight any issues.

  • zoggling - Thursday, May 17, 2018 2:23 AM

    harikumar.mindi - Wednesday, May 16, 2018 9:17 AM

    I would assume log backup file corruption. It could be either during the backup in the source or after copying to the destination.
    >>Check for the log file size (sometimes it can 0KB which is corrupt)
    >>Try RESTORE VERIFYONLY on the logfiles and see if it reports any corruption. Specially the last ones restored before the error occurred.
    >>If possible try manual restore

    Thanks harikumar.mindi, unfortunately corruption of the log backup does not appear to be the issue, since once the previous backup had been restored over the top of the standby database, all transaction logs (including the problematic one) restored without any issue.

    Could it be because of any existing connections to the database ?
    Try configuring NO RECOVERY instead of STANDBY mode for logshipping
    Set up a delay in restoring the log files

  • zoggling - Thursday, May 17, 2018 2:27 AM

    Sue_H - Wednesday, May 16, 2018 10:13 AM

    zoggling - Wednesday, May 16, 2018 8:09 AM

    Log shipping has been configured between two of our SQL Server instances in different data centres, and has been working fine for many months. However, occasionally the LS_Alert job reports that one of the secondary databases is out of sync. On closer inspection, the secondary database in question is no longer in "Standby/Read-Only" mode but is instead "Restoring...". From this point on, every time the log shipping restore job attempts to restore the latest transaction logs to this secondary database, the following error occurs:

    *** Error: An error occurred while processing the log for database '********'. If possible, restore from backup. If a backup is not available, it might be necessary to rebuild the log.
    An error occurred during recovery, preventing the database '********' (12:0) from restarting. Diagnose the recovery errors and fix them, or restore from a known good backup. If errors are not corrected or expected, contact Technical Support.
    RESTORE LOG is terminating abnormally.
    Processed 0 pages for database '********', file '********' on file 1.
    Processed 2 pages for database '********', file '********_Log' on file 1.(.Net SqlClient Data Provider) ***

    On multiple occasions now we have had to resort to restoring the latest primary database backup over the secondary database (in standby mode) to resolve this issue. However, the issue has occurred three times in the past two weeks alone, and has occurred many times in different environments and databases in our experience.

    This issue is not limited to SQL Server 2016; we have experienced this issue with log shipping to secondary databases in SQL Server 2012 and 2014 too.

    Has anyone else experienced this? Does anyone know the cause? We previously posted a similar thread at the link below, but never got to the bottom as to why this is happening.

    https://www.sqlservercentral.com/Forums/Topic1727457-3411-1.aspx

    If it's happening in different environments and different databases I would suspect something being not so good with your storage or some other piece in your I/O subsystem. Look at what is common in all of those environments - using same SAN? Does server has the same NIC and same drivers? Is antivirus software excluding the data/log directories for all servers? Things along those lines.

    Sue

    Network/storage was a suspected cause in the previous forum thread too. It is a possibility, how would one go about proving this? The problem occurs infrequently, and without warning. Other logs (such as Event Viewer) do not appear to highlight any issues.

    By checking all of the components in the I/O subsystem path and comparing things among the server having issues. Event logs and intermittent issues don't necessarily rule it out. Start by checking if the servers with the issues are all the same models, same components. Check the NICs, drivers being used, Are they all using multipathing, are they all using the same pool of disks, etc. Once we had intermittent I/O stalls for quite a few instances. Nothing else logged anywhere when it happened, diagnostics run on a lot of the servers and nothing was found. Then coincidentally a few of the instances all had the IO stalls logged at the exact same time which helped isolate things a bit. It ended up being a problem with a port on one of the switches and due to multipathing, it was intermittent, affecting different instances with no real pattern.  It's not necessarily easy to track down issues like what you have. You need to figure out what you want to check on the different servers and just work through it over time. It takes some perseverance.

    Sue
    .

  • Hi Sue, thanks very much for your advice!

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply