SQL Clone
SQLServerCentral is supported by Redgate
Log in  ::  Register  ::  Not logged in

But I Don't *Want* To "Restore from a backup of the database or repair the database!"

Do you like Log Shipping?

Before my time at Ntirety I had done quite a bit of work with failover clustering and mirroring (trending into some Availability Groups) but I had never really done that much replication or log shipping.  Even at my previous managed services remote DBA job I simply hadn't run across it...

I have said in this blog more than once that education (training and real-world) is one of the most important parts of being a DBA, and that we should always be learning; my time at Ntirety has definitely educated me about replication and log shipping!

This Log Shipping story (maybe its a cautionary tale - so many of these stories are) begins with a failed server migration.  We were trying to replace a server with a new VM and due to some issues completely unrelated to log shipping, we ended up reverting to the original server (maybe once we figure that out I will write about it too).  

The problem was that I had already dropped the log shipping from OldServer01.Database99 to LSSecondary.Database99 and created new log shipping from NewServer01.Database99 to LSSecondary.Database99.  Part of the backout plan was dropping *that* log shipping and recreating the original relationship between OldServer01 and LSSecondary.

I had the creation process scripted out, so I ran the scripts, made sure they succeeded, checked that LSSecondary.Database99 reflected the current data (we had it in Standyby/Read-Only) and then walked away from log shipping to continue the rest of the revert process.


Monday morning rolled around and one of the business users reported that Database99 on LSSecondary was in Restoring rather than Standby, and wanted me to check it out.  Sure enough, Database99 was in Restoring even though the LSRestore job wasn't running and there was no active RESTORE process running.


I check the properties of the log shipping secondary in the Management Studio GUI and they looked as expected:

You can see these same settings via T-SQL with the following query:
SELECT secondary_database
, restore_mode
, disconnect_users
, last_restored_file
FROM msdb.dbo.log_shipping_secondary_databases 

Since the Database shows restore_mode 1 (Standby) and disconnect_users 1 (true) there is no reason by the configuration that the database should be in RESTORING.

The next stop was the SQL Server Error Log.  This probably should have been my first stop, but since I had just set this log shipping up I thought of the configuration first.  The Error Log was a horror show:


Error: 3456, Severity: 16, State: 1.
Could not redo log record (31185729:63924:22), for transaction ID (0:0), on page (1:511238), database 'Database99' (database ID 5). Page: LSN = (31185729:63857:14), type = 16. Log: OpCode = 7, context 24, PrevPageLSN: (31185729:63804:511). Restore from a backup of the database, or repair the database


…followed by 100+ lines of Error Stack Dump <cue scary music>:


Using 'dbghelp.dll' version '4.0.5'
**Dump thread - spid = 0, EC = 0x00000002E61A20F0
***Stack Dump being sent to F:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\LOG\SQLDump0001.txt
* *******************************************************************************
*   10/22/17 03:00:34 spid 55
* HandleAndNoteToErrorlog: Exception raised, major=34, minor=56, severity=16
* Input Buffer 506 bytes -
*             RESTORE LOG [Database99] FROM DISK = N'\\LSSecondary\
*  Database99\Database99_20171022053000.trn' WITH FILE = 1, STANDBY =
*   N'F:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\DA
*  TA\Database99_20171022090018.tuf'
*  MODULE                          BASE      END       SIZE
* sqlservr                       0000000000A40000  00000000045EDFFF  03bae000
* ntdll                          0000000077510000  00000000776B8FFF  001a9000
* kernel32                       00000000772F0000  000000007740EFFF  0011f000
* KERNELBASE                     000007FEFD560000  000007FEFD5CBFFF  0006c000
* ADVAPI32                       000007FEFF730000  000007FEFF80AFFF  000db000
* msvcrt                         000007FEFED70000  000007FEFEE0EFFF  0009f000
* sechost                        000007FEFE5B0000  000007FEFE5CEFFF  0001f000
* RPCRT4                         000007FEFE630000  000007FEFE75CFFF  0012d000
* MSVCR80                        0000000074F50000  0000000075018FFF  000c9000
* MSVCP80                        0000000074BD0000  0000000074CD8FFF  00109000
* sqlos                          00000000745D0000  00000000745D6FFF  00007000
* Secur32                        000007FEFCFE0000  000007FEFCFEAFFF  0000b000
* SSPICLI                        000007FEFD260000  000007FEFD284FFF  00025000
* pdh                            000007FEF6F00000  000007FEF6F4DFFF  0004e000
* SHLWAPI                        000007FEFF680000  000007FEFF6F0FFF  00071000
* GDI32                          000007FEFEC20000  000007FEFEC86FFF  00067000
* USER32                         0000000077410000  0000000077509FFF  000fa000
* LPK                            000007FEFF810000  000007FEFF81DFFF  0000e000
* USP10                          000007FEFF190000  000007FEFF258FFF  000c9000
* USERENV                        000007FEFC740000  000007FEFC75DFFF  0001e000
* profapi                        000007FEFD3D0000  000007FEFD3DEFFF  0000f000
* WINMM                          000007FEF6EC0000  000007FEF6EFAFFF  0003b000
* IPHLPAPI                       000007FEFB670000  000007FEFB696FFF  00027000
* NSI                            000007FEFE620000  000007FEFE627FFF  00008000
* WINNSI                         000007FEFB660000  000007FEFB66AFFF  0000b000
* opends60                       00000000745E0000  00000000745E7FFF  00008000
* NETAPI32                       000007FEFAF00000  000007FEFAF15FFF  00016000
* netutils                       000007FEFC970000  000007FEFC97BFFF  0000c000
* srvcli                         000007FEFCF20000  000007FEFCF42FFF  00023000
* wkscli                         000007FEFAEE0000  000007FEFAEF4FFF  00015000
* LOGONCLI                       000007FEFCC80000  000007FEFCCAFFFF  00030000
* SAMCLI                         000007FEFA310000  000007FEFA323FFF  00014000
* BatchParser                    00000000743D0000  00000000743FCFFF  0002d000
* IMM32                          000007FEFF700000  000007FEFF72DFFF  0002e000
* MSCTF                          000007FEFEE10000  000007FEFEF18FFF  00109000
* psapi                          00000000776D0000  00000000776D6FFF  00007000
* instapi10                      0000000074AA0000  0000000074AACFFF  0000d000
* cscapi                         000007FEF6EB0000  000007FEF6EBEFFF  0000f000
* sqlevn70                       0000000071050000  0000000071250FFF  00201000
* ntmarta                        000007FEFC530000  000007FEFC55CFFF  0002d000
* WLDAP32                        000007FEFF130000  000007FEFF181FFF  00052000
* CRYPTSP                        000007FEFCBE0000  000007FEFCBF7FFF  00018000
* rsaenh                         000007FEFC9E0000  000007FEFCA26FFF  00047000
* CRYPTBASE                      000007FEFD300000  000007FEFD30EFFF  0000f000
* BROWCLI                        000007FEF6DE0000  000007FEF6DF1FFF  00012000
* AUTHZ                          000007FEFCE70000  000007FEFCE9EFFF  0002f000
* MSCOREE                        000007FEF9F90000  000007FEF9FFEFFF  0006f000
* mscoreei                       000007FEF9EE0000  000007FEF9F77FFF  00098000
* ole32                          000007FEFEF20000  000007FEFF122FFF  00203000
* credssp                        000007FEFC900000  000007FEFC909FFF  0000a000
* msv1_0                         000007FEFCB80000  000007FEFCBD1FFF  00052000
* cryptdll                       000007FEFCF50000  000007FEFCF63FFF  00014000
* kerberos                       000007FEFCCB0000  000007FEFCD67FFF  000b8000
* MSASN1                         000007FEFD4B0000  000007FEFD4BEFFF  0000f000
* schannel                       000007FEFC980000  000007FEFC9D6FFF  00057000
* CRYPT32                        000007FEFD630000  000007FEFD79CFFF  0016d000
* security                       0000000074210000  0000000074212FFF  00003000
* WS2_32                         000007FEFE5D0000  000007FEFE61CFFF  0004d000
* SHELL32                        000007FEFD820000  000007FEFE5A8FFF  00d89000
* OLEAUT32                       000007FEFEC90000  000007FEFED66FFF  000d7000
* ftimport                       0000000060000000  0000000060024FFF  00025000
* MSFTE                          0000000049980000  0000000049D2DFFF  003ae000
* VERSION                        000007FEFC520000  000007FEFC52BFFF  0000c000
* dbghelp                        000000006F0C0000  000000006F21DFFF  0015e000
* WINTRUST                       000007FEFD7A0000  000007FEFD7DAFFF  0003b000
* ncrypt                         000007FEFCE20000  000007FEFCE6FFFF  00050000
* bcrypt                         000007FEFCDF0000  000007FEFCE11FFF  00022000
* mswsock                        000007FEFCC20000  000007FEFCC74FFF  00055000
* wship6                         000007FEFCDE0000  000007FEFCDE6FFF  00007000
* wshtcpip                       000007FEFC620000  000007FEFC626FFF  00007000
* ntdsapi                        000007FEFB0A0000  000007FEFB0C6FFF  00027000
* DNSAPI                         000007FEFCA30000  000007FEFCA8AFFF  0005b000
* rasadhlp                       000007FEFAA70000  000007FEFAA77FFF  00008000
* fwpuclnt                       000007FEFB560000  000007FEFB5B2FFF  00053000
* bcryptprimitives               000007FEFC7A0000  000007FEFC7EBFFF  0004c000
* CLBCatQ                        000007FEFF560000  000007FEFF5F8FFF  00099000
* sqlncli10                      0000000072690000  0000000072947FFF  002b8000
* COMCTL32                       000007FEF6F50000  000007FEF6FEFFFF  000a0000
* COMDLG32                       000007FEFF4C0000  000007FEFF556FFF  00097000
* SQLNCLIR10                     0000000071260000  0000000071296FFF  00037000
* netbios                        000007FEF4040000  000007FEF4049FFF  0000a000
* xpsqlbot                       0000000074270000  0000000074277FFF  00008000
* xpstar                         0000000065380000  0000000065407FFF  00088000
* SQLSCM                         0000000074590000  000000007459DFFF  0000e000
* ODBC32                         000007FEF2BA0000  000007FEF2C50FFF  000b1000
* ATL80                          0000000074F00000  0000000074F1FFFF  00020000
* odbcint                        00000000733D0000  0000000073407FFF  00038000
* clusapi                        000007FEF6760000  000007FEF67AFFFF  00050000
* resutils                       000007FEF6740000  000007FEF6758FFF  00019000
* xpstar                         0000000074240000  0000000074264FFF  00025000
* xplog70                        0000000074280000  000000007428FFFF  00010000
* xplog70                        0000000074220000  0000000074221FFF  00002000
* dsrole                         000007FEFBCD0000  000007FEFBCDBFFF  0000c000
* RpcRtRemote                    000007FEFD3B0000  000007FEFD3C3FFF  00014000
* ssdebugps                      000007FEEAE20000  000007FEEAE31FFF  00012000
* apphelp                        000007FEFD2A0000  000007FEFD2F6FFF  00057000
* msxmlsql                       00000000632C0000  0000000063410FFF  00151000
* dbghelp                        000000006F8D0000  000000006FA2DFFF  0015e000
*     P1Home: 0044004C00510053:  
*     P2Home: 00000000004A87B4:  0122002501210023  0124002900230027  0126002D0025002B  012800310127002F  012A003501290033  002C0039002B0037  
*     P3Home: 00000000004C8310:  000000002A010FF0  000000002A010FF0  0000000017407610  000000001AA86DD0  000000001736E3F0  000000002A027FD0  
*     P4Home: 0000000000000000:  
*     P5Home: 0052004500AE00AC:  
*     P6Home: 000000001742C658:  0050005C003A0046  00720067006F0072  00460020006D0061  00730065006C0069  00630069004D005C  006F0073006F0072  
* ContextFlags: 000000000010000F:  0000000000000000  0000000000000000  0000000000000000  0000000000000000  0000000000000000  0000000000000000  
*      MxCsr: 0000000000001FA0:  
*      SegCs: 0000000000000033:  
*      SegDs: 000000000000002B:  
*      SegEs: 000000000000002B:  
*      SegFs: 0000000000000053:  
*      SegGs: 000000000000002B:  
*      SegSs: 000000000000002B:  
*     EFlags: 0000000000000202:  
*        Rax: 0000000000991AD3:  0000000000000000  0000000000000000  0000000000000000  0000000000000000  0000000000000000  0000000000000000  
*        Rcx: 000000002C568260:  0044004C00510053  00000000004A87B4  00000000004C8310  0000000000000000  0052004500AE00AC  000000001742C658  
*        Rdx: 0000000000000000:  
*        Rbx: 0000000000000000:  
*        Rsp: 000000002C568870:  0000000000000000  00000002E61A20F0  000000002C568AA0  000007FFFFEDC000  00000000000042AC  0000000000000000  
*        Rbp: 0000000000000000:  
*        Rsi: 00000002E61A20F0:  00000002E61A20F0  000000063039CEA0  0000000000000000  0000000000000004  00000002E61A2110  00000002E61A2110  
*        Rdi: 000000002C568AA0:  000000001742C650  FFFFFFFFFFFFFFFE  000007FFFFEDB4A8  0000000000000000  00000006699981A0  0000000000A426C3  
*         R8: 0000000000000000:  
*         R9: 0000000000000000:  
*        R10: 0000000000000000:  
*        R11: 000000002C569080:  0044004C00510053  00000000004A87B4  00000000004C8310  0000000000000000  0052004500AE00AC  000000001742C658  
*        R12: 0000000000000001:  
*        R13: 000000063039CEA0:  000000063039C080  000000063039C3B0  000000063039CAD0  00000002E61A2080  000000063039CF10  000000008089CE80  
*        R14: 000000002C569F60:  0064006E00610048  006E00410065006C  0074006F004E0064  0045006F00540065  0072006F00720072  003A0067006F006C  
*        R15: 000000000000003F:  
*        Rip: 000007FEFD56B3DD:  C3000000C8C48148  003B830874F68548  0001BD1689660376  02D6840FFD3B0000  F3840F02FF830000  870F07FF83000002  
* *******************************************************************************
* -------------------------------------------------------------------------------
* Short Stack Dump
000007FEFD56B3DD Module(KERNELBASE+000000000000B3DD)
0000000002AE981D Module(sqlservr+00000000020A981D)
0000000002AEDD3A Module(sqlservr+00000000020ADD3A)
0000000002B210C3 Module(sqlservr+00000000020E10C3)
0000000002B20F5D Module(sqlservr+00000000020E0F5D)
0000000000E3F5D8 Module(sqlservr+00000000003FF5D8)
0000000000E3F2A6 Module(sqlservr+00000000003FF2A6)
0000000002A47DBB Module(sqlservr+0000000002007DBB)
0000000002B24941 Module(sqlservr+00000000020E4941)
0000000002D14BFB Module(sqlservr+00000000022D4BFB)
0000000002CC0A44 Module(sqlservr+0000000002280A44)
0000000002CC0719 Module(sqlservr+0000000002280719)
0000000002CBFDC5 Module(sqlservr+000000000227FDC5)
0000000002D4C1B4 Module(sqlservr+000000000230C1B4)
0000000002D4B845 Module(sqlservr+000000000230B845)
0000000002CA321E Module(sqlservr+000000000226321E)
0000000002D4DD33 Module(sqlservr+000000000230DD33)
0000000002D52D0F Module(sqlservr+0000000002312D0F)
0000000002CC8CE6 Module(sqlservr+0000000002288CE6)
0000000001F288A1 Module(sqlservr+00000000014E88A1)
0000000000A99D59 Module(sqlservr+0000000000059D59)
0000000000A9A9B8 Module(sqlservr+000000000005A9B8)
0000000000A9B30C Module(sqlservr+000000000005B30C)
0000000000A9C1A6 Module(sqlservr+000000000005C1A6)
0000000000AE5342 Module(sqlservr+00000000000A5342)
0000000000A4BBD8 Module(sqlservr+000000000000BBD8)
0000000000A4B8BA Module(sqlservr+000000000000B8BA)
0000000000A4B6FF Module(sqlservr+000000000000B6FF)
0000000000F68FB6 Module(sqlservr+0000000000528FB6)
0000000000F69175 Module(sqlservr+0000000000529175)
0000000000F69839 Module(sqlservr+0000000000529839)
0000000000F69502 Module(sqlservr+0000000000529502)
0000000074F537D7 Module(MSVCR80+00000000000037D7)
0000000074F53894 Module(MSVCR80+0000000000003894)
00000000773059CD Module(kernel32+00000000000159CD)
000000007753B981 Module(ntdll+000000000002B981)
Stack Signature for the dump is 0x000000012832F01D
External dump process return code 0x20000001.  External dump process returned no errors. 


... followed by:


Error: 9003, Severity: 20, State: 6.
The log scan number (31185729:63630:107) passed to log scan in database 'Database99' is not valid. This error may indicate data corruption or that the log file (.ldf) does not match the data file (.mdf). If this error occurred during replication, re-create the publication. Otherwise, restore from backup if the problem results in a failure during startup. 


Pretty scary looking errors…I checked the database on OldServer01, especially since the CheckDB job hadn't run in several days (back to before the migration attempt), and thankfully it was clear:


Date and time: 2017-10-23 10:49:35
Server: OldServer01
Version: 10.50.4305.0
Edition: Enterprise Edition (64-bit)
Procedure: [master].[dbo].[DatabaseIntegrityCheck]
Parameters: @Databases = 'Database99', @CheckCommands = 'CHECKDB', @PhysicalOnly = 'N', @NoIndex = 'N', @ExtendedLogicalChecks = 'N', @TabLock = 'N'
, @FileGroups = NULL, @Objects = NULL, @LockTimeout = NULL, @LogToTable = 'N', @Execute = 'Y'
Date and time: 2017-10-23 10:49:35
Database: [Database99]
Status: ONLINE
Standby: No
Updateability: READ_WRITE
User access: MULTI_USER
Is accessible: Yes
Recovery model: FULL
Date and time: 2017-10-23 10:49:35
Outcome: Succeeded
Duration: 00:35:44
Date and time: 2017-10-23 11:25:19


After some poking around and several rebuilds of log shipping (deleting the secondary, taking a new FULL on OldServer01, restoring it to LSSecondary, and then recreating the process) I figured out the issue was related to the first part of the first error: 

“Could not redo log record (31185747:40079:1), for transaction ID (0:1129634848), on page (1:10522412), database 'Database99'” 

The redo reader process was looking in the log shipping folder \\LSSecondary\Database99 and trying to restore the oldest TRN log backup files, which were related to the log shipping configuration from before the migration/fail-back over the weekend.  Those older TRN files had log sequence numbers (LSNs) that were from before the FULL backup used to initialize the re-created log shipping, which was making the overall process error out and blowing up the process.

A colleague suggested that the log shipping restore should be skipping TRN files that are "too old" via LSN (which matched my understanding of how it "should" work) and that maybe this reflected one of the old TRN files having an invalid file header.  After some consideration we realized that probably was *not* the case since the error message doesn't reflect not being able to read the backup file header, and if it couldn't read the header how would it have a log record number (31185747:40079:1) to include in the "could not redo" portion of the error?

I deleted the old backups from \\LSSecondary\Database99 and tried to run the process again, and received the same errors.  When I looked in \\LSSecondary\Database99 I saw all of the old TRN files were back!

I had failed to consider (sigh) that the LSCopy job would bring all of the "old" files from the primary folder at \\OldServer01\Database99 back to \\LSSecondary\Database99, putting me back in the same situation.

Take 2 - I deleted all of the log shipping backups from *both* \\OldServer01\Database99 and \\LSSecondary\Database99 and then re-created the log shipping one last time…

This time the LS_Restore job functioned as expected, Database99 was in Standby/Read-Only on LSSecondary, and instead of my 100+ line stack dump I received some very respectable information messages in the SQL Error Log on LSSecondary:

Setting database option SINGLE_USER to ON for database Database99.
Starting up database 'Database99'.
The database 'Database99' is marked RESTORING and is in a state that does not allow recovery to be run.
Recovery is writing a checkpoint in database 'Database99' (5). This is an informational message only. No user action is required.
Recovery completed for database Database99 (database ID 5) in 2 second(s) (analysis 1080 ms, redo 0 ms, undo 466 ms.) This is an informational message only. No user action is required.
Starting up database 'Database99'.
CHECKDB for database 'Database99' finished without errors on 2017-10-23 10:49:44.400 (local time). This is an informational message only; no user action is required.
Log was restored. Database: Database99, creation date(time): 2017/10/22(02:20:41), first LSN: 31185760:34887:1, last LSN: 31185760:44042:1, number of dump devices: 1, device information: (FILE=1, TYPE=DISK: {'\\LSSecondary\Database99\Database99_20171023204500.trn'}). This is an informational message. No user action is required.
Setting database option MULTI_USER to ON for database Database99.

The SINGLE_USER and MULTI_USER messages are artifacts of telling the log shipping to disconnect users when restoring backups.


Hope this helps!

Nebraska SQL from @DBA_ANDY

I’m a forty-something Microsoft SQL Server DBA of 15+ years, a devoted husband, and a father of three young boys. I have been a DBA at a public university, at a major bank, at a healthcare system, and I now work as a remote DBA with customers across the United States. I write and speak primarily about the tips and tricks that I discover along my SQL Server journey.


Leave a comment on the original post [nebraskasql.blogspot.com, opens in a new window]

Loading comments...