A full backup of a database is taking an abnormally long amount of time to complete

  • Hello --

    We are running SQL Server 2008 R2 64-bit on a Windows 2012 R2 Standard 64-bit system. Our backup plan utilizes the MaintenanceSolutions.sql scripts. The Database Backup = USER_DATABASES - FULL job involves twenty databases of sizes ranging from twenty megabytes to over 300 gigabytes. All backups are on an iSCSI mounted volume.

    Two of the largest databases were recently reconfigured to utilize a separate FILESTREAM volume from the others due to space limitations of the original volume. Both were interactively backed up to a temporary folder on another volume. The databases were then deleted, and subsequently restored with the new FILESTREAM volume configured in their options. The databases were successfully accessed by our users after the reconfiguration of each had been complete.

    The larger of the two databases, which is 360 gigabytes, completes backing up in about three hours. However the other database, which is 326 gigabytes, is taking a day and a half to complete. I compared the properties of the two databases, and I did not see anything different in their configurations.

    The DatabaseIntegrityCheck -USER_DATABASES and IndexOptimize - USER_DATABASES job histories did not list any problems. The backups of all the other databases are not taking nearly as long to complete.

    I am going to be interactively backing up the problematic database this evening to see if the issue is within the backup job, or the database itself. Beyond that, what other steps can I take to troubleshoot and correct this issue?

  • I would recommend looking at disk performance on both ends and connectivity.

    I'm backing up a 600GB database in 12 - 14 minutes using SQL Native.

    Check the following: Are the data drives formatted with 4K or 64K allocation units?

    In CMD Prompt run >fsutil fsinfo ntfsinfo X: Where X is the data drive letter

    Check the value for the "Bytes per Cluster" value, you want this to be 65536 (not 4096), and you should get some performance improvements.

    Changing this is a pain as you need to add drives, stop SQL, copy the files, rename drives and start SQL, but it's worth it and you may get general performance improvements as well.

    Try splitting the backup into multiple output files. It's not quite linear, but 2 files should close to half your backup time provided your disks and connectivity can handle the IOs.

    Cheers

    Leo

    Nothing is ever so complicated that with a little bit of work it can't be made more complicated.

    Leo
    Nothing in life is ever so complicated that with a little work it can't be made more complicated.

  • Hello --

    Thank-you for your reply. I checked the volume in question, and it does have the value suggested in your posting. The problem appears to be more than just one database taking a long time to complete. The backup job started on June 13 at 10:00 in the morning. I did a check of the backup this morning, and it is still running with three more databases to be completed by the job. Listed below are the times that each database has finished:

    DATABASE 1 - 6/13/15 - 10:44 AM 15.1 GB

    DATABASE 2 - 6/13/15 - 3:01 PM 76.1 GB

    DATABASE 3 - 6/13/15 - 7:45 PM 79.5 GB

    DATABASE 4 - 6/13/15 - 8:37 PM 333 GB

    DATABASE 5 - 6/16/15 - 3:12 AM 368 GB

    DATABASE 6 - 6/16/15 - 6:47 AM 46 GB

    DATABASE 7 - 6/16/15 - 10:23 PM 166 GB

    DATABASE 8 - 6/16/15 - 11:19 PM 12.9 GB

    DATABASE 9 - 6/17/15 - 5:24 AM 77.7 GB

    DATABASE 10 - 6/17/15 - 6:08 AM 7.09 GB

    DATABASE 11 - 6/17/15 - 6:36 AM 5.35 GB

    The database server itself has been up for over 151 days. It is also possible the port(s) on the switch to which the server is connected might be bad.

    There are two courses of action that I was planning on doing:

    1. Reboot the server.

    2. Trace the network cables of the server to the network switch, and move them to new ports.

    Is there anything else that I should consider?

  • Hello --

    I ran Performance Monitor last night, and generated a blg file. The report included the following information:

    Domain network adapter: Bytes received/sec - 33,111.117

    Bytes sent/sec - 3,566,988.769

    Bytes Total/sec - 3,600,099.886

    Current Bandwidth - 1,000,000,000

    Output Queue Length - 0.000

    10 GbE network adapter: Bytes received/sec - 3,419,435.809

    Bytes sent/sec - 10,902.127

    Bytes Total/sec - 3,430,337.936

    Current Bandwidth - 10,000,000,000.0000

    Output Queue Length - 0.000

    10 GbE network adapter 2: Bytes received/sec - 1,028.571

    Bytes sent/sec - 1,115.982

    Bytes Total/sec - 2,144.552

    Current Bandwidth - 10,000,000,000.0000

    Output Queue Length - 0.000

    Physical Disk:

    % Disk Time 14.947

    % Idle Time 85.572

    Avg. Disk Read Queue Length 0.013

    Avg. Disk sec/Read 0.004

    Avg. Disk sec/Write 0.187

    Avg. Disk Write Queue Length 1.033

    Disk Reads/sec 3.684

    Disk Transfers/sec 9.380

    Disk Writes/sec 5.696

    It appears there is a problem with the second ten gigabit adapter based on the stark contrast between it, and its twin.

    I am not sure how to interpret the physical disk results.

    Feedback, especially on the latter would be appreciated.

    Thanks.

  • Hello -–

    I did a check of the hardware involved in this situation: server nics, and switches, and there was nothing to indicate the problem

    was hardware related.

    I had another question concerning the backup. Instead of having a job that would do all the databases in a single full backup,

    what about creating several jobs that would run simultaneously? The larger databases could be divided among the jobs, and

    perhaps that would help speed things up? This could also be done with the differential backups.

    Anyone's thoughts on this?

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply