DB restore stucks after recovery step

  • Hi, trying to restore my db for testing https://www.sqlservercentral.com/Forums/1862799/checkdb-Multiple-IAM-pages-for-object
    I splitted the restore process into 2 steps, i restore full with norecovery and now a differential restore with recovery,
    this is the  diff restore  command:
    RESTORE DATABASE [dbxxxxx] FROM DISK = N'x:\file.diff' WITH FILE = 1, MOVE N'dbxxxxx' TO N'x:\dbxxxxx.mdf', MOVE ....
    .... (18K+)
    MAXTRANSFERSIZE=2097152,
    BUFFERCOUNT=1100, STATS = 5;

    The full backup restore tooks 5hours (1 TB), but the differential restore(220GB) has been ongoing for about 6 hours.
    I've got the 100% completed  at 11:23 uct, here some errorlog messages: (now 16.00)
    2017-03-08 11:23:42.290 spid58 Recovery completed for database dbxxxxx (database ID 5) in 7 second(s) (analysis 719 ms, redo 2040 ms, undo 1863 ms.) This is an informational message only. No user action is required.
    2017-03-08 11:23:42.150 spid58 Recovery is writing a checkpoint in database 'dbxxxxx' (5). This is an informational message only. No user action is required.
    2017-03-08 11:22:21.600 spid58 Starting up database 'dbxxxxx'.
    2017-03-08 11:20:34.110 spid58 The database 'dbxxxxx' is marked RESTORING and is in a state that does not allow recovery to be run.
    2017-03-08 11:19:36.210 spid58 Starting up database 'dbxxxxx'.
    --- Full Restore ends
    2017-03-07 18:09:28.310 Backup RESTORE DATABASE successfully processed 951509357 pages in 15081.637 seconds (492.895 MB/sec).

    What is it doing????
    resource monitor shows very low disk and cpu activity, Much less than when restore is running, also file B/sec shows low activity on db files.

    Restore session still in "executing..." on ssms, and  in  running  state from sysprocesses, without wait type or locks
    in ssms db isn't  marks as restoring/recovering, sinply  it's not accessible
    sys.databases  marks  it as ONLINE (state_desc)

    Maybe a problem with master databases?
    I noticed that if I query sys.sysdatabases instead of sys.databases i get blocked by restore spid with a LCK_M_S

    locks info:
    <Database name="master">
    <Objects>
      <Object name="sysbrickfiles" schema_name="sys">
      <Locks>
       <Lock resource_type="KEY" index_name="clst" request_mode="S" request_status="WAIT" request_count="1" />
       <Lock resource_type="OBJECT" request_mode="Sch-S" request_status="GRANT" request_count="1" />
      </Locks>
      </Object>
      <Object name="sysdbreg" schema_name="sys">
      <Locks>
       <Lock resource_type="KEY" index_name="clst" request_mode="S" request_status="GRANT" request_count="1" />
       <Lock resource_type="OBJECT" request_mode="Sch-S" request_status="GRANT" request_count="1" />
      </Locks>
      </Object>
    </Objects>
    </Database>

    any ideas?
    this is a test server, with limited resource, but this is strange.

    this isnt the first time ,  i know  it will finish  in about  10 h

    forgot: sql version 12.0.4100.1

  • What is in the log AFTER the differential restore?

    And what is your full restore command?  What about if you remove all of the optional arguments and just stick with the required ones?  Like don't worry about the max transfer size and buffer count... do you get better performance then instead of waiting for 10 hours?

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • no new entry in errorlog, differential restore still running
    this is the first rows of exec xp_readerrorlog 0,1, N'',N'',null,null,N'desc'
    2017-03-08 11:23:42.290 spid58 Recovery completed for database dbxxxxx (database ID 5) in 7 second(s) (analysis 719 ms, redo 2040 ms, undo 1863 ms.) This is an informational message only. No user action is required.
    2017-03-08 11:23:42.150 spid58 Recovery is writing a checkpoint in database 'dbxxxxx' (5). This is an informational message only. No user action is required.
    2017-03-08 11:22:21.600 spid58 Starting up database 'dbxxxxx'.
    2017-03-08 11:20:34.110 spid58 The database 'dbxxxxx' is marked RESTORING and is in a state that does not allow recovery to be run.
    2017-03-08 11:19:36.210 spid58 Starting up database 'dbxxxxx'.
    2017-03-07 18:09:28.310 Backup RESTORE DATABASE successfully processed 951509357 pages in 15081.637 seconds (492.895 MB/sec).

    no others entries

    the  full restore command is the same as differential one with "norecovery" option and the full backup file name
    RESTORE DATABASE [dbxxxxx] FROM DISK = N'x:\file.bak' WITH FILE = 1, MOVE N'dbxxxxx' TO N'x:\dbxxxxx.mdf', MOVE ....
    .... (18K+)
    NORECOVERY,
    MAXTRANSFERSIZE=2097152,
    BUFFERCOUNT=1100, STATS = 5;

    i will try  without  buffers  options, i use them in backup command to speed up backup process, it works, backup done in 2h instead of 8

    ty

  • sgt500 - Wednesday, March 8, 2017 10:05 AM

    no new entry in errorlog, differential restore still running
    this is the first rows of exec xp_readerrorlog 0,1, N'',N'',null,null,N'desc'
    2017-03-08 11:23:42.290 spid58 Recovery completed for database dbxxxxx (database ID 5) in 7 second(s) (analysis 719 ms, redo 2040 ms, undo 1863 ms.) This is an informational message only. No user action is required.
    2017-03-08 11:23:42.150 spid58 Recovery is writing a checkpoint in database 'dbxxxxx' (5). This is an informational message only. No user action is required.
    2017-03-08 11:22:21.600 spid58 Starting up database 'dbxxxxx'.
    2017-03-08 11:20:34.110 spid58 The database 'dbxxxxx' is marked RESTORING and is in a state that does not allow recovery to be run.
    2017-03-08 11:19:36.210 spid58 Starting up database 'dbxxxxx'.
    2017-03-07 18:09:28.310 Backup RESTORE DATABASE successfully processed 951509357 pages in 15081.637 seconds (492.895 MB/sec).

    no others entries

    the  full restore command is the same as differential one with "norecovery" option and the full backup file name
    RESTORE DATABASE [dbxxxxx] FROM DISK = N'x:\file.bak' WITH FILE = 1, MOVE N'dbxxxxx' TO N'x:\dbxxxxx.mdf', MOVE ....
    .... (18K+)
    NORECOVERY,
    MAXTRANSFERSIZE=2097152,
    BUFFERCOUNT=1100, STATS = 5;

    i will try  without  buffers  options, i use them in backup command to speed up backup process, it works, backup done in 2h instead of 8

    ty

    I agree that those options can improve both backup and restore times, but I am thinking it probably wouldn't hurt to test it with no optional arguments.  Are you putting "RECOVERY" in place of "NORECOVERY" in your restore?
    also what do you mean by that 18K+?  Do you have 18,000 mdf files?

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • No, just remove the norecovery option,
    restore ended afert 9.55 h with these messages:

    Msg 3009, Level 16, State 1, Line 2
    Could not insert a backup or restore history/detail record in the msdb database. This may indicate a problem with the msdb database. The backup/restore operation was still successful.
    RESTORE DATABASE successfully processed 177112062 pages in 6283.694 seconds (220.202 MB/sec).
    Msg 701, Level 17, State 123, Line 2
    There is insufficient system memory in resource pool 'default' to run this query.

    6000 seconds? really?

    the errorlogs:

    2017-03-08 19:22:58.490    Backup    RESTORE DATABASE successfully processed 177112062 pages in 6283.694 seconds (220.202 MB/sec).
    2017-03-08 19:22:38.010    Server    
    /******  I have omitted some memory dump information *******/
    ---------------------------------------- ----------
    Database             36352
    Simulated             8541
    Target             16384000
    Dirty              3024
    In IO               1
    Latched               0
    Page Life Expectancy          91
    2017-03-08 19:22:37.890    spid58    There is insufficient system memory in resource pool 'default' to run this query.
    2017-03-08 19:22:37.890    spid58    Error: 701, Severity: 17, State: 123.
    2017-03-08 19:22:37.550    spid38s    There is insufficient system memory in resource pool 'internal' to run this query.
    2017-03-08 19:22:37.550    spid38s    Error: 701, Severity: 17, State: 123.
    2017-03-08 19:22:37.390    Server    
    .........
    /****** I have omitted some memory dump information *******/
    Memory Manager            KB
    ---------------------------------------- ----------
    VM Reserved           22065692
    VM Committed           6775348
    Locked Pages Allocated          0
    Large Pages Allocated          0
    Emergency Memory           1024
    Emergency Memory In Use         16
    Target Committed          6815744
    Current Committed         6775352
    Pages Allocated          4274896
    Pages Reserved            0
    Pages Free            10016
    Pages In Use           6474968
    Page Alloc Potential          -8
    NUMA Growth Phase           0
    Last OOM Factor            1
    Last OS Error             0
    2017-03-08 19:22:37.380    Server    
    Process/System Counts         Value
    ---------------------------------------- ----------
    Available Physical Memory      83431424
    Available Virtual Memory      140714439229440
    Available Paging File       1906311168
    Working Set          6204039168
    Percent of Committed Memory in WS      89
    Page Faults           9479594
    System physical memory high        0
    System physical memory low        1
    Process physical memory low        1
    Process virtual memory low        0
    2017-03-08 19:22:37.380    Server     Failed allocate pages: FAIL_PAGE_ALLOCATION 1
    2017-03-08 19:01:41.210    Backup    Database changes were restored. Database: db_mn1, creation date(time): 2011/04/20(01:16:56), first LSN: 16842369:734463:8, last LSN: 16842369:3870416:1, number of dump devices: 1, device information: (FILE=1, TYPE=DISK: {'S:\SQL\Backup\db_mn1_D_20170307001521_3.diff'}). This is an informational message. No user action is required.
    2017-03-08 19:01:39.350    Backup    Restore is complete on database 'db_mn1'. The database is now available.
    2017-03-08 19:01:32.730    spid58    CHECKDB for database 'db_mn1' finished without errors on 2017-01-19 10:12:19.417 (local time). This is an informational message only; no user action is required.
    2017-03-08 11:23:42.290    spid58    Recovery completed for database db_mn1 (database ID 5) in 7 second(s) (analysis 719 ms, redo 2040 ms, undo 1863 ms.) This is an informational message only. No user action is required.
    2017-03-08 11:23:42.150    spid58    Recovery is writing a checkpoint in database 'db_mn1' (5). This is an informational message only. No user action is required.
    2017-03-08 11:22:21.600    spid58    Starting up database 'db_mn1'.
    2017-03-08 11:20:34.110    spid58    The database 'db_mn1' is marked RESTORING and is in a state that does not allow recovery to be run.
    2017-03-08 11:19:36.210    spid58    Starting up database 'db_mn1'.

    got some memory dump information, i didn't paste here beacuse too long

  • Well, first I would add in the RECOVERY option to the restore.

    Second, it sounds like you are out of resources to do the restore.  how much free space do you have in MSDB and on disk?  
    At the time this restore is occuring, what does your memory look like?
    Are you a member of the sysadmin group (this is required for a restore I believe)?
    Are you manually running CHECKDB on the database post restore?  It looks to me like that is what is happening and that is what is taking 8.5 hours.  I do not believe that a RESTORE DATABASE normally calls CHECKDB when it is finished, but you may have some option in there that is causing that to happen.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • > Well, first I would add in the RECOVERY option to the restore.
    I'll try the next attempt, without buffers options

    > Second, it sounds like you are out of resources to do the restore. how much free space do you have in MSDB and on disk?
    restore ends good at the end, msdb has 150mb free, datafile with autogrowth unlimited, disk free space is 3tb

    At the time this restore is occuring, what does your memory look like?
    > really underpressure, 93% (8gb ram on a vm)

    >Are you a member of the sysadmin group (this is required for a restore I believe)?
    yup

    >Are you manually running CHECKDB on the database post restore? It looks to me like that is what is happening and that is what is taking 8.5 hours. I do not believe that a RESTORE
    DATABASE normally calls CHECKDB when it is finished, but you may have some option in there that is causing that to happen.

     I was deceived by the message on the check db too the first time. In reality that message indicates only the date of the last proper execution of checkdb for that db. No checkdb by the restore command was executed

  • Right.  About the CHECKDB thing.  that threw me off.  But you have not run checkdb in quite some time, no?   2017-01-19 appears to be the last time it was run.  Might want to do that more frequently (if possible).
    But I wouldn't be surprised if putting that "RESTORE" fixes things for you.  I am pretty sure that SQL likes to have a RECOVERY or NORECOVERY option or it doesn't know what to do.  If you use the GUI (I know... GUIs are for non-DBA's and any real DBA wouldn't use the GUI for stuff they can script, but it is a nice way to double check things), the Recovery state is a required option.  And your options are RECOVERY, NORECOVERY, or STANDBY.

    I wouldn't be surprised if something was just hanging waiting for the restore to finish and it never saw it end properly.

    When you say 93%, do you mean 93% free or 93% used?  And when you say 8GB RAM on a VM, are you saying 8 GB used or 8 GB free or 8 GB total?  Since you have 3 TB of disk space, I am assuming you mean 8 GB free.  What is the max memory SQL is allowed to use and how much does that leave for the OS and other services?

    I am fairly certain the RECOVERY option will fix your issue, but just trying to figure out a secondary thing it might be... how is the log free space on msdb?  And how big is MSDB?  is 150 MB a lot of free space for you or is that 99% full?  And what is your autogrow set to?  a percentage or an actual value?  Values generally will auto-grow faster, but I know I don't like to autogrow unless it happens as a surprise (ie an infrequently used DB suddenly gets a lot of use due to some process change or a bug in someones code and it fills up much faster than it ever did before for example).

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • just launched the restore of the db with two commands, the first for full restore with NORECOVERY the second with diff and RECOVERY option
    No buffers related options added

    This is a test server, only  need  for this task and for:
    https://www.sqlservercentral.com/Forums/1862799/checkdb-Multiple-IAM-pages-for-object

    the server has only 8 GB of RAM, of which 93% occupied (now 91% without buffers option in restore command), sqlserver max memory is 6656mb.
    there 's a log of  disk space (15TB)  to  restore  this  9tb database

    msdb is very small 225MB with 150 free space autogrowth by 10%
    just run checkdb on all system db, no errors found
    in a production database i don't like autogrowth too, but this is just for testing
    Tempdb has 4x2gb files only 1 with autogrowth to check if 8gb are enaugh

    I interest me understand this behavior to avoid in case of emergency

    I will have the results tomorrow or Monday ....

  • 9 TB of data is a lot of data to restore, so it doesn't surprise me it takes a while especially on a system with that low of memory.

    I have a nearly 800 GB set of databases and restoring that is a 1.5 to 2 hour process on a good day.  Although that is a nightly full plus the hourly tlogs that I restore.

    The errors I see here:
    2017-03-08 19:22:37.890 spid58 There is insufficient system memory in resource pool 'default' to run this query.
    2017-03-08 19:22:37.890 spid58 Error: 701, Severity: 17, State: 123.
    2017-03-08 19:22:37.550 spid38s There is insufficient system memory in resource pool 'internal' to run this query.
    2017-03-08 19:22:37.550 spid38s Error: 701, Severity: 17, State: 123.

    to me make me think your memory got too full.  SPID 58 is what is doing the restore.  I am uncertain what spid 38s was doing.  But I could be mistaken, but I thought the "best practice" was to use 4 GB for the OS plus extra GB for other services and the rest for SQL (it depends on what the other services are too.  Like if you have SSIS or SSRS on the same box, I doubt that 8GB is going to be enough memory).  You are only giving the OS and other services 2 GB to work with.  That SQL instance to me looks to be memory starved and the OS likely is too.  I'd get more RAM in the VM if you can as I am fairly certain that is what is causing a lot of the slowness in your restore.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • bmg002 - Thursday, March 9, 2017 10:28 AM

    9 TB of data is a lot of data to restore, so it doesn't surprise me it takes a while especially on a system with that low of memory.

    I have a nearly 800 GB set of databases and restoring that is a 1.5 to 2 hour process on a good day.  Although that is a nightly full plus the hourly tlogs that I restore.

    The errors I see here:
    2017-03-08 19:22:37.890 spid58 There is insufficient system memory in resource pool 'default' to run this query.
    2017-03-08 19:22:37.890 spid58 Error: 701, Severity: 17, State: 123.
    2017-03-08 19:22:37.550 spid38s There is insufficient system memory in resource pool 'internal' to run this query.
    2017-03-08 19:22:37.550 spid38s Error: 701, Severity: 17, State: 123.

    to me make me think your memory got too full.  SPID 58 is what is doing the restore.  I am uncertain what spid 38s was doing.  But I could be mistaken, but I thought the "best practice" was to use 4 GB for the OS plus extra GB for other services and the rest for SQL (it depends on what the other services are too.  Like if you have SSIS or SSRS on the same box, I doubt that 8GB is going to be enough memory).  You are only giving the OS and other services 2 GB to work with.  That SQL instance to me looks to be memory starved and the OS likely is too.  I'd get more RAM in the VM if you can as I am fairly certain that is what is causing a lot of the slowness in your restore.

    the last attempt has just finished after 14h
    restore full noreco +  restore diff reco
    RESTORE DATABASE [dbxxxxx] FROM DISK = N'x:\dbxxxxx.bak' WITH FILE = 1, MOVE N....
    NORECOVERY,STATS = 5;
    RESTORE DATABASE [dbxxxxx] FROM DISK = N'x:\dbxxxxx.diff' WITH FILE = 1, MOVE N....
    RECOVERY,STATS = 5;

    at least i got a cleaner log withour errors:
    --- DIFF RESTORE (216GB COMPRESSED) DONE in about 7h
    2017-03-10 07:48:30.880    Backup    RESTORE DATABASE successfully processed 177112062 pages in 3715.398 seconds (372.419 MB/sec).
    2017-03-10 07:24:51.430    Backup    Database changes were restored. Database: dbxxxxx, creation date(time): 2011/04/20(01:16:56), first LSN: 16842369:734463:8, last LSN: 16842369:3870416:1, number of dump devices: 1, device information: (FILE=1, TYPE=DISK: {'x:\xxxxxx.diff'}). This is an informational message. No user action is required.
    2017-03-10 07:24:49.840    Backup    Restore is complete on database 'dbxxxx'. The database is now available.
    2017-03-10 07:24:45.240    spid58    CHECKDB for database 'dbxxxx' finished without errors on 2017-01-19 10:12:19.417 (local time). This is an informational message only; no user action is required.
    2017-03-10 00:00:47.130    spid37s    This instance of SQL Server has been using a process ID of 1200 since 04/02/2017 00:13:06 (local) 03/02/2017 23:13:06 (UTC). This is an informational message only; no user action is required.
    2017-03-09 23:44:59.050    spid58    Recovery completed for database dbxxxx (database ID 5) in 5 second(s) (analysis 259 ms, redo 1013 ms, undo 1894 ms.) This is an informational message only. No user action is required.
    2017-03-09 23:44:58.880    spid58    Recovery is writing a checkpoint in database 'dbxxxx' (5). This is an informational message only. No user action is required.
    2017-03-09 23:43:44.070    spid58    Starting up database 'dbxxxx'.
    2017-03-09 23:41:56.120    spid58    The database 'dbxxxx' is marked RESTORING and is in a state that does not allow recovery to be run.
    2017-03-09 23:40:58.090    spid58    Starting up database 'dbxxxx'.
    --- FULL RESTORE (1TB COMPRESSED FILE 9TB DATABASE FILES) DONE in 6h
    2017-03-09 22:31:49.120    Backup    RESTORE DATABASE successfully processed 951509357 pages in 16573.453 seconds (448.528 MB/sec).
    2017-03-09 22:06:32.330    Backup    Database was restored: Database: dbxxxx, creation date(time): 2011/04/20(01:16:56), first LSN: 16842151:11904518:1, last LSN: 16842155:13677584:1, number of dump devices: 1, device information: (FILE=1, TYPE=DISK: {'x:\xxxxxx.bak'}). Informational message. No user action required.
    2017-03-09 21:43:45.620    spid58    The database 'dbxxxx' is marked RESTORING and is in a state that does not allow recovery to be run.
    2017-03-09 21:42:46.520    spid58    Starting up database 'dbxxxx'.
    2017-03-09 17:09:36.360    spid56    Using 'xplog70.dll' version '2014.120.2000' to execute extended stored procedure 'xp_msver'. This is an informational message only; no user action is required.
    2017-03-09 17:09:36.180    spid56    Attempting to load library 'xplog70.dll' into memory. This is an informational message only. No user action is required.

    I will run it again with some traceflag on
    DBCC TRACEON(3004);
    DBCC TRACEON(3014);
    DBCC TRACEON(3604);
    DBCC TRACEON(3605);

  • I might have an idea - how many VLF's do you have?
    DBCC LogInfo(<database name>)

    If you have a crazy high number of VLF's, that will greatly affect your database restore time.  

    Failing that, It looks like you have a 9 TB database dumping data into 1 mdf file, correct?  Breaking that up into multiple mdf's across mutiple disks should improve performance.  It looks like you are averaging right around 400 MB/s between both restores (eyeballing it, and it is a little over 400, but just as a rough estimate).  So you have 9437184 MB of data that you are writing to disk at a rate of 400 MB/s plus after that is restored, you are then writing the changes to the database in.  Since this backup is roughly 1/9 the size (in number of pages), I would expect it to complete in roughly the 1/9 the time.  
    BUT if you have a large number of VLF's, the "RECOVERY" step can be slow (I believe... I could be wrong on this).
    One thing you could try is have both of the queries use "NORECOVERY" and do a:
    RESTORE DATABASE <database name> WITH RECOVERY
    after and see how long that step takes.  This will give you an idea on if it is something with bringing the database back online.

    Also, do you have instant file initialization turned on?

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • bmg002 - Friday, March 10, 2017 7:45 AM

    I might have an idea - how many VLF's do you have?
    DBCC LogInfo(<database name>)

    If you have a crazy high number of VLF's, that will greatly affect your database restore time.  

    Failing that, It looks like you have a 9 TB database dumping data into 1 mdf file, correct?  Breaking that up into multiple mdf's across mutiple disks should improve performance.  It looks like you are averaging right around 400 MB/s between both restores (eyeballing it, and it is a little over 400, but just as a rough estimate).  So you have 9437184 MB of data that you are writing to disk at a rate of 400 MB/s plus after that is restored, you are then writing the changes to the database in.  Since this backup is roughly 1/9 the size (in number of pages), I would expect it to complete in roughly the 1/9 the time.  
    BUT if you have a large number of VLF's, the "RECOVERY" step can be slow (I believe... I could be wrong on this).
    One thing you could try is have both of the queries use "NORECOVERY" and do a:
    RESTORE DATABASE <database name> WITH RECOVERY
    after and see how long that step takes.  This will give you an idea on if it is something with bringing the database back online.

    Also, do you have instant file initialization turned on?

    log file is 800gb splitted into 224 vlf, i have rebuild it and pregrown in step to reduce vlf number months ago, and instant initialization is on, that's can't apply to logfile btw.
    DB is restored into 18k+ datafile

    this is the log of last restore with some trace on, they gives me more informations:

    2017-03-10 10:55:50.120  spid58  DBCC TRACEON 3004, server process ID (SPID) 58. This is an informational message only; no user action is required.
    2017-03-10 10:55:50.240  spid58  DBCC TRACEON 3014, server process ID (SPID) 58. This is an informational message only; no user action is required.
    2017-03-10 10:55:50.250  spid58  DBCC TRACEON 3604, server process ID (SPID) 58. This is an informational message only; no user action is required.
    2017-03-10 10:55:50.250  spid58  DBCC TRACEON 3605, server process ID (SPID) 58. This is an informational message only; no user action is required.
    --- START FULL RESTORE NO RECO
    2017-03-10 10:55:50.310  spid58  RestoreDatabase: Database dbxxxx
    2017-03-10 10:55:50.440  spid58  Opening backup set
    2017-03-10 10:55:50.470  spid58  SetTargetRestoreAge: 0
    2017-03-10 10:55:55.260  spid58  Restore: Configuration section loaded
    2017-03-10 10:55:55.260  spid58  Restore: Backup set is open
    2017-03-10 10:55:55.270  spid58  Restore: Planning begins
    2017-03-10 10:57:22.920  spid58  Restore: Planning complete
    2017-03-10 10:57:22.960  spid58  Restore: BeginRestore (offline) on dbxxxx
    2017-03-10 10:57:23.010  spid58  Restore: Attached database dbxxxx as DBID=5
    2017-03-10 10:57:23.010  spid58  Restore: PreparingContainers
    2017-03-10 10:59:42.820  spid58  Zeroing X:\xxxxx\dbxxxx\dbxxxx_log.ldf from page 1 to 104857728 (0x2000 to 0xc800100000)
    2017-03-10 11:00:43.050  spid58  Restore: Containers are ready
    2017-03-10 11:00:44.070  spid58  Restore: Restoring backup set
    2017-03-10 11:00:44.070  spid58  Restore: Transferring data to dbxxxx
    2017-03-10 11:00:44.070  spid58  Starting MSDA in stream 0, thread 1844, length 0x70ea6200000
    2017-03-10 12:13:04.360  spid58  Zeroing completed on X:\xxxxx\dbxxxx\dbxxxx_log.ldf
    2017-03-10 15:34:42.440  spid58  Completed MSDA in stream 0, thread 1844
    2017-03-10 15:34:42.500  spid58  Restore: Waiting for log zero on dbxxxx
    2017-03-10 15:34:42.510  spid58  Restore: LogZero complete
    2017-03-10 15:39:58.200  spid58  SetTargetRestoreAge: 0
    2017-03-10 15:43:02.850  spid58  FileHandleCache: 0 files opened. CacheSize: 12
    2017-03-10 15:43:02.850  spid58  Restore: Data transfer complete on dbxxxx
    2017-03-10 15:43:04.560  spid58  Restore: Backup set restored
    2017-03-10 15:43:16.640  spid58  Starting up database 'dbxxxx'.
    2017-03-10 15:44:05.570  spid58  The database 'dbxxxx' is marked RESTORING and is in a state that does not allow recovery to be run.
    2017-03-10 15:44:06.250  spid58  Restore-Redo begins on database dbxxxx
    2017-03-10 15:44:17.680  spid58  Restore VLF headers for 224 headers on database dbxxxx
    2017-03-10 15:44:19.050  spid58  Restore VLF headers completed on database dbxxxx
    2017-03-10 15:44:19.180  spid58  RunOfflineRedo logIter.Init(): FirstLsn(PruId: 0): 0x100fda7:0xb5a606:0x1
    2017-03-10 15:44:19.180  spid58  RunOfflineRedo logIter.Init(): LastLsn(PruId: 0): 0x100fdab:0xd0b410:0x1
    2017-03-10 16:12:15.180  spid58  OfflineRollforward: StopLsn/LastLsn(PruId: 0): 0x100fdab:0xd0b410:0x1
    2017-03-10 16:13:36.410  spid58  FlushCache: cleaned up 481037 bufs with 204407 writes in 81167 ms (avoided 0 new dirty bufs) for db 5:0
    2017-03-10 16:13:36.410  spid58   average writes per second: 2518.35 writes/sec
      average throughput: 46.30 MB/sec, I/O saturation: 65599, context switches 62078
    2017-03-10 16:13:36.410  spid58   last target outstanding: 216, avgWriteLatency 0
    2017-03-10 16:15:26.620  spid58  Rollforward complete on database dbxxxx
    2017-03-10 16:15:28.150  spid58  Restore: Done with fixups
    2017-03-10 16:15:29.530  spid58  Restore: Writing history records
    2017-03-10 16:15:29.530  Backup  Database was restored: Database: dbxxxx, creation date(time): 2011/04/20(01:16:56), first LSN: 16842151:11904518:1, last LSN: 16842155:13677584:1, number of dump devices: 1, device information: (FILE=1, TYPE=DISK: {'X:\xxxxx\dbxxxx_F_20170304001517_7.bak'}). Informational message. No user action required.
    2017-03-10 16:17:29.580  spid58  Writing backup history records
    2017-03-10 16:40:44.840  spid58  Restore: Done with MSDB maintenance
    2017-03-10 16:40:44.850  Backup  RESTORE DATABASE successfully processed 951509357 pages in 16938.343 seconds (438.866 MB/sec).
    2017-03-10 16:40:44.900  spid58  RestoreDatabase: Finished
    2017-03-10 16:40:44.900  spid58  RestoreDatabase: Database dbxxxx
    2017-03-10 16:40:44.920  spid58  X-locking database: dbxxxx
    -- RESTORE DIFF BACKUP RECO
    2017-03-10 16:40:45.010  spid58  Opening backup set
    2017-03-10 16:40:45.040  spid58  SetTargetRestoreAge: 0
    2017-03-10 16:40:49.720  spid58  Restore: Configuration section loaded
    2017-03-10 16:40:49.720  spid58  Restore: Backup set is open
    2017-03-10 16:40:49.730  spid58  Restore: Planning begins
    2017-03-10 16:43:21.450  spid58  Dismounting FullText catalogs
    2017-03-10 16:43:21.450  spid58  Restore: Planning complete
    2017-03-10 16:43:21.450  spid58  Restore: BeginRestore (offline) on dbxxxx
    2017-03-10 16:43:33.260  spid58  Restore: PreparingContainers
    2017-03-10 16:47:38.490  spid58  Zeroing X:\xxxxx\dbxxxx\dbxxxx_log.ldf from page 1 to 104857728 (0x2000 to 0xc800100000)
    2017-03-10 16:47:38.890  spid58  Restore: Containers are ready
    2017-03-10 16:47:39.660  spid58  Restore: Restoring backup set
    2017-03-10 16:47:39.660  spid58  Restore: Transferring data to dbxxxx
    2017-03-10 16:47:39.660  spid58  Starting MSDA in stream 0, thread 1844, length 0x15170e00000
    2017-03-10 17:55:09.550  spid58  Zeroing completed on X:\xxxxx\dbxxxx\dbxxxx_log.ldf
    2017-03-10 18:03:35.960  spid58  Completed MSDA in stream 0, thread 1844
    2017-03-10 18:03:35.960  spid58  Restore: Waiting for log zero on dbxxxx
    2017-03-10 18:03:35.960  spid58  Restore: LogZero complete
    2017-03-10 18:03:50.780  spid58  SetTargetRestoreAge: 0
    2017-03-10 18:06:58.730  spid58  FileHandleCache: 0 files opened. CacheSize: 12
    2017-03-10 18:06:58.730  spid58  Restore: Data transfer complete on dbxxxx
    2017-03-10 18:06:59.910  spid58  Restore: Backup set restored
    2017-03-10 18:07:11.570  spid58  Starting up database 'dbxxxx'.
    2017-03-10 18:07:59.980  spid58  The database 'dbxxxx' is marked RESTORING and is in a state that does not allow recovery to be run.
    2017-03-10 18:08:00.690  spid58  Restore-Redo begins on database dbxxxx
    2017-03-10 18:08:11.460  spid58  Restore VLF headers for 224 headers on database dbxxxx
    2017-03-10 18:08:12.650  spid58  Restore VLF headers completed on database dbxxxx
    2017-03-10 18:08:12.750  spid58  RunOfflineRedo logIter.Init(): FirstLsn(PruId: 0): 0x100fe81:0xb34ff:0x8
    2017-03-10 18:08:12.750  spid58  RunOfflineRedo logIter.Init(): LastLsn(PruId: 0): 0x100fe81:0x3b0ed0:0x1
    2017-03-10 18:09:06.420  spid58  OfflineRollforward: StopLsn/LastLsn(PruId: 0): 0x100fe81:0x3b0ed0:0x1
    2017-03-10 18:09:51.060  spid58  Rollforward complete on database dbxxxx
    2017-03-10 18:09:52.620  spid58  Restore: Done with fixups
    2017-03-10 18:09:52.650  spid58  Restore: Transitioning database to ONLINE
    2017-03-10 18:09:52.920  spid58  Restore: Restarting database for ONLINE
    2017-03-10 18:10:05.590  spid58  Starting up database 'dbxxxx'.
    2017-03-10 18:11:09.030  spid58  Zeroing X:\xxxxx\dbxxxx\dbxxxx_log.ldf from page 66302317 to 66302797 (0x7e762da000 to 0x7e7669a000)
    2017-03-10 18:11:09.040  spid58  Zeroing completed on X:\xxxxx\dbxxxx\dbxxxx_log.ldf
    2017-03-10 18:11:13.470  spid58  Recovery is writing a checkpoint in database 'dbxxxx' (5). This is an informational message only. No user action is required.
    ----- ???????? -----
    2017-03-10 18:11:13.600  spid58  Recovery completed for database dbxxxx (database ID 5) in 5 second(s) (analysis 233 ms, redo 1127 ms, undo 1932 ms.) This is an informational message only. No user action is required.
    2017-03-11 00:00:08.000  spid33s  This instance of SQL Server has been using a process ID of 1200 since 04/02/2017 00:13:06 (local) 03/02/2017 23:13:06 (UTC). This is an informational message only; no user action is required.
    2017-03-11 01:47:43.050  spid58  PostRestoreContainerFixups: fixups complete
    ----- ???????? -----

    2017-03-11 01:47:43.850  spid58  CHECKDB for database 'dbxxxx' finished without errors on 2017-01-19 10:12:19.417 (local time). This is an informational message only; no user action is required.
    2017-03-11 01:47:43.860  spid58  PostRestoreReplicationFixup for dbxxxx starts
    2017-03-11 01:47:48.470  spid58  PostRestoreReplicationFixup for dbxxxx complete
    2017-03-11 01:47:48.790  spid58  Restore: Database is restarted
    2017-03-11 01:47:48.790  Backup  Restore is complete on database 'dbxxxx'. The database is now available.
    2017-03-11 01:47:49.410  spid58  Resuming any halted fulltext crawls
    2017-03-11 01:47:50.110  spid58  Restore: Writing history records
    2017-03-11 01:47:50.110  Backup  Database changes were restored. Database: dbxxxx, creation date(time): 2011/04/20(01:16:56), first LSN: 16842369:734463:8, last LSN: 16842369:3870416:1, number of dump devices: 1, device information: (FILE=1, TYPE=DISK: {'X:\xxxxx\dbxxxx_D_20170307001521_3.diff'}). This is an informational message. No user action is required.
    2017-03-11 01:49:37.730  spid58  Writing backup history records
    2017-03-11 02:11:33.170  spid58  Restore: Done with MSDB maintenance
    2017-03-11 02:11:33.180  Backup  RESTORE DATABASE successfully processed 177112062 pages in 4758.936 seconds (290.755 MB/sec).
    2017-03-11 02:11:33.220  spid58  RestoreDatabase: Finished
    2017-03-12 00:00:36.860  spid27s  This instance of SQL Server has been using a process ID of 1200 since 04/02/2017 00:13:06 (local) 03/02/2017 23:13:06 (UTC). This is an informational message only; no user action is required.
    2017-03-13 00:00:05.580  spid30s  This instance of SQL Server has been using a process ID of 1200 since 04/02/2017 00:13:06 (local) 03/02/2017 23:13:06 (UTC). This is an informational message only; no user action is required.

    what's "PostRestoreContainerFixups: fixups complete"? it seems that this step is to take a long time, for now I haven't found anything interesting about it
    iI thought it might be something related to the log file (es zeroing), in reality all the steps related to it end up in a reasonable timeframe

    ps: compressed backup runs in about 2.45 h / 2h, 650-750mb/s, but on a different san

  • sgt500 - Wednesday, March 8, 2017 7:57 AM

    MAXTRANSFERSIZE=2097152,
    BUFFERCOUNT=1100, 

    Remove the above and just go with the default transfersize and buffercount

    -----------------------------------------------------------------------------------------------------------

    "Ya can't make an omelette without breaking just a few eggs" 😉

  • Perry Whittle - Monday, March 13, 2017 6:17 AM

    sgt500 - Wednesday, March 8, 2017 7:57 AM

    MAXTRANSFERSIZE=2097152,
    BUFFERCOUNT=1100, 

    Remove the above and just go with the default transfersize and buffercount

    Hi Perry,
    ll subsequent attempts, including the last, have been made without
    restore ends without errors, but i can't explain the 6.30h between recovery complete and postrestorefixups:
    2017-03-10 18:11:13.600 spid58 Recovery completed for database dbxxxx (database ID 5) in 5 second(s) (analysis 233 ms, redo 1127 ms, undo 1932 ms.) This is an informational message only. No user action is required.
    2017-03-11 00:00:08.000 spid33s This instance of SQL Server has been using a process ID of 1200 since 04/02/2017 00:13:06 (local) 03/02/2017 23:13:06 (UTC). This is an informational message only; no user action is required.
    2017-03-11 01:47:43.050 spid58 PostRestoreContainerFixups: fixups complete

    what about this step?

Viewing 15 posts - 1 through 15 (of 32 total)

You must be logged in to reply to this topic. Login to reply