Investigating Slow Restore Process

  • I have a SQL Server 2014 instance. I am restoring native backups that reside on a share on a virtual machine over to a physical machine. The size of the back up is about 130 GB and the database is in the neighborhood of 700 GB after it has been restored. For some reason it is taking almost a full 24 hours to do a restore. The two machines are on the same network and subnet.

    I can also tell you that just transferring files between these two machines is extremely slow and most file transfers between any machines on the network are slow. I'm concerned for a number of reasons, but mainly because this issue will hurt our ability to recover quickly, and currently, it makes it painful to refresh non-prod from a prod backup.

    I don't have a Windows or Network admin at my job and I am hoping I can get some help here. What do I need to check or what things do I need to consider that might be causing this issue?

  • There are several things that pop into my mind.

    1. If the database backup isn't a compressed backup, then you have some serious file size and freespace issues if the backup is only 130GB and the restored database is 700GB. When is the last time you did a full backup and a transaction log backup? How big are the MDF and LDF files? When is the last time you did index maintenance?

    2. What speed is the network between the two boxes? If the comms are setup for "auto-negotiate", that could be a problem because it frequently choses incorrectly. Are all the network cards and switches/routers setup for the same speed? Also, the comms should be setup for FULL DUPLEX. None of the cables should be over 20 feet and no cable should have any sharp bends in them. I try to never use less than a 3" radius bend and prefer a 6" radius.

    3. Does the target machine have a Virus Scan software on it? If so, have MDF and LDF files been properly excluded?

    4. Is anything else going on on the target machine?

    5. Is instant file initialization enabled?

    6. What is the speed of the target disks?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • If the file transfer speed is slow, that could explain the entire issue. You might want to keep a local copy of the backup for DR situations. Otherwise, figure out what's up with your network that's causing the slow transfer speed.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Apart from the network issue, which needs to be investigated, how big is the log file and what are the VLF counts on it?

    This would also contribute to your recovery times if you have a badly fragmented log file.

    Agree in checking if instant file initialization is working for you on the server too, but will only assist with the data files being restored, not the log file.

  • Jeff Moden (7/26/2015)


    There are several things that pop into my mind.

    1. If the database backup isn't a compressed backup, then you have some serious file size and freespace issues if the backup is only 130GB and the restored database is 700GB. When is the last time you did a full backup and a transaction log backup? How big are the MDF and LDF files? When is the last time you did index maintenance?

    2. What speed is the network between the two boxes? If the comms are setup for "auto-negotiate", that could be a problem because it frequently choses incorrectly. Are all the network cards and switches/routers setup for the same speed? Also, the comms should be setup for FULL DUPLEX. None of the cables should be over 20 feet and no cable should have any sharp bends in them. I try to never use less than a 3" radius bend and prefer a 6" radius.

    3. Does the target machine have a Virus Scan software on it? If so, have MDF and LDF files been properly excluded?

    4. Is anything else going on on the target machine?

    5. Is instant file initialization enabled?

    6. What is the speed of the target disks?

    Database compression is turned on at the instance level. The MDF is 653 GB and the LDF is 70 GB. Index maintenance happens every night and uses logic comparable to Olla. There is no anti-virus running on the SQL instance. In terms of the network all I can tell you right now is that the setting "Speed and Duplex" on the NICs on the source box are set to auto-negotiate and Jumbo Mtu is at 1500. I thought IFI was enabled but it is not. I made the changes and will have to reboot during a maintenance window to enable that. The target disk is a Fusion IO so it is fast.

  • Justin Manning SA (7/27/2015)


    Apart from the network issue, which needs to be investigated, how big is the log file and what are the VLF counts on it?

    This would also contribute to your recovery times if you have a badly fragmented log file.

    Agree in checking if instant file initialization is working for you on the server too, but will only assist with the data files being restored, not the log file.

    Over the weekend I did some VLF cleanup and shaved about 150 VLF's from the database I am trying to restore. There are still too many but I will have to try again to reduce them. I did make changes to auto-growth based on a couple of articles by Kimberly Tripp.

  • Lack of IFI and slow network will kill your restore speed. Curious though, how long does it take to run a backup?

    You should also see a pretty substantial restore time with IFI on.

    If your network is slow and there's nothing you can do about it, try to come up with an alternative solution or workarounds. If you have the capacity, can you create two backups? A backup to a share and a copy of that backup to the vm where the file is going to be restored?

    Can you robocopy from the share to the other system? So that when you're ready to restore it's already moved over and should be much faster.

    Can you set up log shipping instead?

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply