Log shipping secondary database occasionally corrupting

Question

Log shipping secondary database occasionally corrupting

zoggling

SSCrazy

Points: 2234
More actions
May 16, 2018 at 8:09 am

#344635

Log shipping has been configured between two of our SQL Server instances in different data centres, and has been working fine for many months. However, occasionally the LS_Alert job reports that one of the secondary databases is out of sync. On closer inspection, the secondary database in question is no longer in "Standby/Read-Only" mode but is instead "Restoring...". From this point on, every time the log shipping restore job attempts to restore the latest transaction logs to this secondary database, the following error occurs:
*** Error: An error occurred while processing the log for database '********'. If possible, restore from backup. If a backup is not available, it might be necessary to rebuild the log.
An error occurred during recovery, preventing the database '********' (12:0) from restarting. Diagnose the recovery errors and fix them, or restore from a known good backup. If errors are not corrected or expected, contact Technical Support.
RESTORE LOG is terminating abnormally.
Processed 0 pages for database '********', file '********' on file 1.
Processed 2 pages for database '********', file '********_Log' on file 1.(.Net SqlClient Data Provider) ***
On multiple occasions now we have had to resort to restoring the latest primary database backup over the secondary database (in standby mode) to resolve this issue. However, the issue has occurred three times in the past two weeks alone, and has occurred many times in different environments and databases in our experience.
This issue is not limited to SQL Server 2016; we have experienced this issue with log shipping to secondary databases in SQL Server 2012 and 2014 too.
Has anyone else experienced this? Does anyone know the cause? We previously posted a similar thread at the link below, but never got to the bottom as to why this is happening.
https://www.sqlservercentral.com/Forums/Topic1727457-3411-1.aspx

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply

harikumar.mindi SSCommitted Points: 1972 More actions · Answer 1

I would assume log backup file corruption. It could be either during the backup in the source or after copying to the destination.
>>Check for the log file size (sometimes it can 0KB which is corrupt)
>>Try RESTORE VERIFYONLY on the logfiles and see if it reports any corruption. Specially the last ones restored before the error occurred.
>>If possible try manual restore

Sue_H SSC Guru Points: 90860 More actions · Answer 2

zoggling - Wednesday, May 16, 2018 8:09 AM
Log shipping has been configured between two of our SQL Server instances in different data centres, and has been working fine for many months. However, occasionally the LS_Alert job reports that one of the secondary databases is out of sync. On closer inspection, the secondary database in question is no longer in "Standby/Read-Only" mode but is instead "Restoring...". From this point on, every time the log shipping restore job attempts to restore the latest transaction logs to this secondary database, the following error occurs:
*** Error: An error occurred while processing the log for database '********'. If possible, restore from backup. If a backup is not available, it might be necessary to rebuild the log.
An error occurred during recovery, preventing the database '********' (12:0) from restarting. Diagnose the recovery errors and fix them, or restore from a known good backup. If errors are not corrected or expected, contact Technical Support.
RESTORE LOG is terminating abnormally.
Processed 0 pages for database '********', file '********' on file 1.
Processed 2 pages for database '********', file '********_Log' on file 1.(.Net SqlClient Data Provider) ***
On multiple occasions now we have had to resort to restoring the latest primary database backup over the secondary database (in standby mode) to resolve this issue. However, the issue has occurred three times in the past two weeks alone, and has occurred many times in different environments and databases in our experience.
This issue is not limited to SQL Server 2016; we have experienced this issue with log shipping to secondary databases in SQL Server 2012 and 2014 too.
Has anyone else experienced this? Does anyone know the cause? We previously posted a similar thread at the link below, but never got to the bottom as to why this is happening.
https://www.sqlservercentral.com/Forums/Topic1727457-3411-1.aspx

If it's happening in different environments and different databases I would suspect something being not so good with your storage or some other piece in your I/O subsystem. Look at what is common in all of those environments - using same SAN? Does server has the same NIC and same drivers? Is antivirus software excluding the data/log directories for all servers? Things along those lines.

Sue

zoggling SSCrazy Points: 2234 More actions · Answer 3

harikumar.mindi - Wednesday, May 16, 2018 9:17 AM
I would assume log backup file corruption. It could be either during the backup in the source or after copying to the destination.
>>Check for the log file size (sometimes it can 0KB which is corrupt)
>>Try RESTORE VERIFYONLY on the logfiles and see if it reports any corruption. Specially the last ones restored before the error occurred.
>>If possible try manual restore

Thanks harikumar.mindi, unfortunately corruption of the log backup does not appear to be the issue, since once the previous backup had been restored over the top of the standby database, all transaction logs (including the problematic one) restored without any issue.

zoggling SSCrazy Points: 2234 More actions · Answer 4

Sue_H - Wednesday, May 16, 2018 10:13 AM
zoggling - Wednesday, May 16, 2018 8:09 AM
Log shipping has been configured between two of our SQL Server instances in different data centres, and has been working fine for many months. However, occasionally the LS_Alert job reports that one of the secondary databases is out of sync. On closer inspection, the secondary database in question is no longer in "Standby/Read-Only" mode but is instead "Restoring...". From this point on, every time the log shipping restore job attempts to restore the latest transaction logs to this secondary database, the following error occurs:
*** Error: An error occurred while processing the log for database '********'. If possible, restore from backup. If a backup is not available, it might be necessary to rebuild the log.
An error occurred during recovery, preventing the database '********' (12:0) from restarting. Diagnose the recovery errors and fix them, or restore from a known good backup. If errors are not corrected or expected, contact Technical Support.
RESTORE LOG is terminating abnormally.
Processed 0 pages for database '********', file '********' on file 1.
Processed 2 pages for database '********', file '********_Log' on file 1.(.Net SqlClient Data Provider) ***
On multiple occasions now we have had to resort to restoring the latest primary database backup over the secondary database (in standby mode) to resolve this issue. However, the issue has occurred three times in the past two weeks alone, and has occurred many times in different environments and databases in our experience.
This issue is not limited to SQL Server 2016; we have experienced this issue with log shipping to secondary databases in SQL Server 2012 and 2014 too.
Has anyone else experienced this? Does anyone know the cause? We previously posted a similar thread at the link below, but never got to the bottom as to why this is happening.
https://www.sqlservercentral.com/Forums/Topic1727457-3411-1.aspx
If it's happening in different environments and different databases I would suspect something being not so good with your storage or some other piece in your I/O subsystem. Look at what is common in all of those environments - using same SAN? Does server has the same NIC and same drivers? Is antivirus software excluding the data/log directories for all servers? Things along those lines.
Sue

Network/storage was a suspected cause in the previous forum thread too. It is a possibility, how would one go about proving this? The problem occurs infrequently, and without warning. Other logs (such as Event Viewer) do not appear to highlight any issues.

harikumar.mindi SSCommitted Points: 1972 More actions · Answer 5

zoggling - Thursday, May 17, 2018 2:23 AM
harikumar.mindi - Wednesday, May 16, 2018 9:17 AM
I would assume log backup file corruption. It could be either during the backup in the source or after copying to the destination.
>>Check for the log file size (sometimes it can 0KB which is corrupt)
>>Try RESTORE VERIFYONLY on the logfiles and see if it reports any corruption. Specially the last ones restored before the error occurred.
>>If possible try manual restore
Thanks harikumar.mindi, unfortunately corruption of the log backup does not appear to be the issue, since once the previous backup had been restored over the top of the standby database, all transaction logs (including the problematic one) restored without any issue.

Could it be because of any existing connections to the database ?
Try configuring NO RECOVERY instead of STANDBY mode for logshipping
Set up a delay in restoring the log files

Sue_H SSC Guru Points: 90860 More actions · Answer 6

zoggling - Thursday, May 17, 2018 2:27 AM
Sue_H - Wednesday, May 16, 2018 10:13 AM
zoggling - Wednesday, May 16, 2018 8:09 AM
Log shipping has been configured between two of our SQL Server instances in different data centres, and has been working fine for many months. However, occasionally the LS_Alert job reports that one of the secondary databases is out of sync. On closer inspection, the secondary database in question is no longer in "Standby/Read-Only" mode but is instead "Restoring...". From this point on, every time the log shipping restore job attempts to restore the latest transaction logs to this secondary database, the following error occurs:
*** Error: An error occurred while processing the log for database '********'. If possible, restore from backup. If a backup is not available, it might be necessary to rebuild the log.
An error occurred during recovery, preventing the database '********' (12:0) from restarting. Diagnose the recovery errors and fix them, or restore from a known good backup. If errors are not corrected or expected, contact Technical Support.
RESTORE LOG is terminating abnormally.
Processed 0 pages for database '********', file '********' on file 1.
Processed 2 pages for database '********', file '********_Log' on file 1.(.Net SqlClient Data Provider) ***
On multiple occasions now we have had to resort to restoring the latest primary database backup over the secondary database (in standby mode) to resolve this issue. However, the issue has occurred three times in the past two weeks alone, and has occurred many times in different environments and databases in our experience.
This issue is not limited to SQL Server 2016; we have experienced this issue with log shipping to secondary databases in SQL Server 2012 and 2014 too.
Has anyone else experienced this? Does anyone know the cause? We previously posted a similar thread at the link below, but never got to the bottom as to why this is happening.
https://www.sqlservercentral.com/Forums/Topic1727457-3411-1.aspx
If it's happening in different environments and different databases I would suspect something being not so good with your storage or some other piece in your I/O subsystem. Look at what is common in all of those environments - using same SAN? Does server has the same NIC and same drivers? Is antivirus software excluding the data/log directories for all servers? Things along those lines.
Sue
Network/storage was a suspected cause in the previous forum thread too. It is a possibility, how would one go about proving this? The problem occurs infrequently, and without warning. Other logs (such as Event Viewer) do not appear to highlight any issues.

By checking all of the components in the I/O subsystem path and comparing things among the server having issues. Event logs and intermittent issues don't necessarily rule it out. Start by checking if the servers with the issues are all the same models, same components. Check the NICs, drivers being used, Are they all using multipathing, are they all using the same pool of disks, etc. Once we had intermittent I/O stalls for quite a few instances. Nothing else logged anywhere when it happened, diagnostics run on a lot of the servers and nothing was found. Then coincidentally a few of the instances all had the IO stalls logged at the exact same time which helped isolate things a bit. It ended up being a problem with a port on one of the switches and due to multipathing, it was intermittent, affecting different instances with no real pattern. It's not necessarily easy to track down issues like what you have. You need to figure out what you want to check on the different servers and just work through it over time. It takes some perseverance.

Sue
.

zoggling SSCrazy Points: 2234 More actions · Answer 7

zoggling

SSCrazy

Points: 2234

May 17, 2018 at 9:19 am

#1990865

Hi Sue, thanks very much for your advice!