Backups Across Network Too Slow

  • More issues with networking, I'm afraid. We are experiencing performance issues with the backups on a couple of the database servers.

    Our backup plan is to back up databases from all of our servers (several of them) across a network to two dedicated servers with big, beefy tape drives. The backups themselves are set up through Enterprise Manager and run across the network to disk on the backup servers. Then the network guys run the backup files to tape, using CA's 'ARCServe' program.

    One of these servers (a clustered server, by the way), only succeeds in backing up very small databases. Any database much larger than 100 MB just refuses to go to the backup server. The interesting thing is that it makes the connection to the other server and creates a placeholder file there, so it isn't a permissions issue. It actually starts churning out a backup, but then just halts and sends the following nastygram into the error log:

    quote:


    Internal I/O request 0x06E1CC38: Op: Write, pBuffer: 0x077A0000, Size: 983040, Position: 27204096, UMS: Internal: 0x0, InternalHigh: 0xF0000, Offset: 0x19F1A00, OffsetHigh: 0x0, m_buf: 0x077A0000, m_len: 983040, m_actualBytes: 0, m_errcode: 121, BackupFile: \\BACKUPSERV1\G$\MSSQL2000\BACKUP\FULLDB\DBCLUSTER2\train\train_db_200308041154.BAK

    BackupDiskFile::RequestDurableMedia: failure on backup device '\\BACKUPSERV1\G$\MSSQL2000\BACKUP\FULLDB\DBCLUSTER2\train\train_db_200308041154.BAK'. Operating system error 64(The specified network name is no longer available.).


    Doesn't mean a thing to me. Doesn't seem to mean much to the network and system admin guys, either, since they have been unable to fix it from their end. In my mind, I have almost completely absolved the DBMS from any foul play in this, but lingering doubt forces me to see if anyone out there knows what any of this might mean.

    On two other (non-clustered) database servers, I don't actually get an error. It just takes for-frigging-ever to back up even a small database to the backup servers. About two hours for a 100 MB database. I think walking the data out the front door, a byte at a time, might even be faster. I don't know what else to do except plaintively wail to the system & network guys.

    On all my other servers, about seven of them, I have no difficulties running backups across the network. We have (otherwise) a very fast backbone, and they run very quickly. On none of my servers do I have trouble running a backup to disk on the host box.

    Like I said, these things strike me as system or network issues. If I'm wrong, and there is something I might do at my end to help things along, I'd be most appreciative for any salient advice.

  • Judging by the error message "Operating system error 64(The specified network name is no longer available.)", it could be a network issue, or the administrative share G$ could have disappeared. I have seen this on one of my NT 4 servers.

    I would suggest creating a non-administrative share and trying the backup again. For example, create a backup folder on your backup server, share that folder as backupdb, and run a command like: BACKUP DATABASE train TO DISK = '\\BACKUPSERV1\backupdb\MSSQL2000\BACKUP\FULLDB\DBCLUSTER2\train\train_db_200308041154.BAK'

    If you can create a second backup folder on a separate RAID array on your backup server and write the database backup to both folders at once, backup performance could improve.

    During the backup, ask your network administrator to monitor the network for errors and bottlenecks.

  • Lee,

    We had similar problems, but now we use a product called sqllitespeed (www.sqllitespeed.com) that's advertised on this site. It fixed the problems we were having, and we now back up 120 GB over the network with no problems. We then use Tivoli to pick up those backups.

    declan

  • I would also suggest you check your network hardware for bottlenecks, especially the network interface cards. I had a similar problem where a 562 GB database backup was either failing (with the same error message you are getting) or taking over 72 hours to complete. Replacing the NIC solved the problem, and the backup now completes in under four hours.


    Joseph
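
The figures quoted in this post can be turned into an average throughput, which makes them easier to compare against what the link should deliver. A minimal Python sketch of that arithmetic, assuming decimal units (1 GB = 10^9 bytes) and treating "under four hours" as exactly four hours; the function name is illustrative and not from the thread. The two-hour, 100 MB backup from the opening post is included for comparison.

```python
def effective_throughput_mbps(size_bytes: float, seconds: float) -> float:
    """Average throughput in megabits per second (decimal units)."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return size_bytes * 8 / seconds / 1_000_000


MB = 1_000_000
GB = 1_000_000_000
HOUR = 3600

# 562 GB in 72 hours (before the NIC was replaced).
before_swap = effective_throughput_mbps(562 * GB, 72 * HOUR)
# 562 GB in four hours (after; "under four hours" makes this a lower bound).
after_swap = effective_throughput_mbps(562 * GB, 4 * HOUR)
# 100 MB in two hours (the slow non-clustered servers in the opening post).
slow_server = effective_throughput_mbps(100 * MB, 2 * HOUR)

print(f"before NIC swap: {before_swap:.1f} Mbit/s")
print(f"after NIC swap:  at least {after_swap:.1f} Mbit/s")
print(f"slow servers:    {slow_server:.2f} Mbit/s")
```

On these assumptions the slow servers are moving data at a small fraction of one megabit per second, which points at a fault (errors, retransmits, a duplex mismatch) rather than at ordinary network overhead.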

  • Edgewood Solutions is a reseller of SQL LiteSpeed and we can assist you with any questions you may have about SQL LiteSpeed. You can call us at 888-788-2444 or register for a free eval version at http://www.edgewoodsolutions.com/litespeedeval.asp

    Thanks

    Greg Robidoux

    Edgewood Solutions

    http://www.edgewoodsolutions.com



  • Personally I would never backup directly to a network share, admin or otherwise.

    Back up locally, then either xcopy the file or have the backup program read from a share on the DB server (not preferred either, but it works).

    Consider the amount of time backing up locally, vs. backing up to a remote share.

    Yes, the DB server needs a few extra drives, but DASD is cheap today.

    This could also save an equal amount of time if a restore is required. Consider the time in reverse.

    KlK, MCSE



  • quote:


    Personally I would never backup directly to a network share, admin or otherwise.


    Okay, I'll bite: Why not? We have a very large server with tons of disk space devoted to backups and a very fast backbone. By backing up across the network, I allow my database servers to retain their disk for other purposes. Really, if speed is the issue, it's not that much slower than backing up right to the box.

    Edited by - Lee Dise on 08/11/2003 10:47:38 AM

  • Well, if the backbone is fast enough, maybe even with dedicated NICs to the backup server, OK.

    But it introduces another variable to deal with, plus some network overhead.

    And I just like having a current copy local. Yes, we sweep them off ASAP with TSM.

    I would be curious for a timing comparison, though.

    KlK, MCSE



  • quote:


    Well if the backbone is fast enough, maybe even dedicated nics to the backup server ok.... I would be curious for a timing comparison, though.


    I did some timings before I switched to this method. It is definitely slower going across the backbone; you're right about that. It probably takes, if not twice as long, somewhere in that ball park. But some portion of the erstwhile time savings would have been spent copying the file over to the backup server, anyway, not to mention the extra disk that gets chewed up, and all the trouble of rigging up automated copying.

    If we had very large databases that needed all night just to back up to the host box, I would agree with you. In our case, however, our biggest backup file barely skirts 38 GB. (We have a whole bunch of them, but no one of them is really big by modern standards.) This allows the convenience factor to outweigh the performance factor, at least in my judgment.

    When it's working, that is.

  • Just an addition to all of the above: the NIC is probably the culprit. A 100 Mbps NIC can transfer at most about 44 GB an hour, so see if that's the issue.

    I do favour backing up locally and then transferring the file.

    In the case of a cluster, we back up to the quorum drive and just point that drive to the other node.

    Cheers..

    Prakash Heda

    Sr Consultant DBA

    Bangalore, India

    Chat: hedafriends@yahoo.com

    Prakash Heda
    Lead DBA Team - www.sqlfeatures.com
    Video sessions on Performance Tuning and SQL 2012 HA
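
The per-hour ceiling mentioned in this post follows directly from the link speed. A rough Python sketch, assuming decimal units and an `efficiency` factor for protocol overhead; both function names and the `efficiency` parameter are assumptions made for illustration, not figures from the thread. At full line rate, 100 Mbps works out to 45 GB per hour, close to the figure quoted above.

```python
def max_gb_per_hour(link_mbps: float, efficiency: float = 1.0) -> float:
    """Upper bound on data moved per hour, in decimal gigabytes.

    efficiency is a hypothetical fudge factor (0..1) for protocol overhead.
    """
    bytes_per_second = link_mbps * 1_000_000 / 8 * efficiency
    return bytes_per_second * 3600 / 1_000_000_000


def hours_to_transfer(size_gb: float, link_mbps: float, efficiency: float = 1.0) -> float:
    """Best-case hours needed to push size_gb across the link."""
    return size_gb / max_gb_per_hour(link_mbps, efficiency)


# Raw line rate of a 100 Mbps link.
print(f"100 Mbps ceiling: {max_gb_per_hour(100):.1f} GB/hour")
# Best case for the 38 GB backup mentioned earlier in the thread.
print(f"38 GB at 100 Mbps: {hours_to_transfer(38, 100):.2f} hours")
```

Comparing a measured backup time against this best case shows quickly whether a server is merely saturating its link or is running far below it.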

  • Hi there

    Just a quick one - are you sure you're running in full-duplex mode (i.e. 100 Mbps)?

    Also, did you try running netmon.exe to locate networking errors?

    Cheers

    Ck

    Chris Kempster

    http://www.chriskempster.com

    Author of "SQL Server 2k for the Oracle DBA"



  • ckempste reminded me of a problem that occurred a few months ago.

    We had a somewhat similar issue with the NICs on a few of our terminal servers (not database servers). Someone had changed the NIC properties to AUTO mode. In AUTO mode, the speed at which clients connect to that server fluctuates.

    If it's FIXED (100 Mbps FULL-DUPLEX), the speed at which the server connects to the network is fixed.

    We had it changed back to FIXED.


  • quote:


    By backing up across the network, I allow my database servers to retain their disk for other purposes. Really, if speed is the issue, it's not that much slower than backing up right to the box.


    As KIK has pointed out, you introduce another variable. Which is more important, getting a successful backup or saving temporary disk space? Once you get the backup, you copy it to another system. Verify the copy. Then delete the backup on the existing system. A few more cycles, more on the temporary storage, but you'll get the backup. Now verifying the backup is a whole different matter...

    K. Brian Kelley

    http://www.truthsolutions.com/

    Author: Start to Finish Guide to SQL Server Performance Monitoring

    http://www.netimpress.com/
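
The copy, verify, delete sequence described in this post might look something like the following Python sketch. It assumes the backup file already exists on local disk and that the backup server is reachable as an ordinary directory path; `move_backup_verified` and `sha256_of` are hypothetical helper names, and the hash comparison is one possible way to verify the copy, not something the thread prescribes.

```python
import hashlib
import os
import shutil
import tempfile


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large backups need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def move_backup_verified(local_path, remote_dir):
    """Copy a backup file, verify the copy, then delete the local file.

    Returns the path of the verified copy. The local file is removed only
    if the two hashes match; otherwise the bad copy is discarded.
    """
    remote_path = os.path.join(remote_dir, os.path.basename(local_path))
    shutil.copy2(local_path, remote_path)
    if sha256_of(local_path) != sha256_of(remote_path):
        os.remove(remote_path)
        raise OSError(f"copy of {local_path} does not match the original")
    os.remove(local_path)
    return remote_path


if __name__ == "__main__":
    # Demonstration with temporary directories standing in for both servers.
    with tempfile.TemporaryDirectory() as local_dir, \
            tempfile.TemporaryDirectory() as remote_dir:
        backup = os.path.join(local_dir, "train_db.BAK")
        with open(backup, "wb") as handle:
            handle.write(b"not a real backup" * 1000)
        print(move_backup_verified(backup, remote_dir))
```

A matching hash only shows that the copy is identical to the local file. It says nothing about whether the backup itself can be restored, which is the separate matter raised at the end of the post.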


  • quote:


    As KIK has pointed out, you introduce another variable.


    <shrug> We add new variables all the time in this business, don't we? The issue isn't how many variables are involved, but rather how reliable the method is from experience.

    Brave or foolish of me to argue with such high-powered SQL Server DBAs, but what the heck (so long as I do it politely).

    I'm currently running backups, across network paths, for about seven different servers. The only one causing an issue would still be an issue if the backups were run locally to disk and then copied to another server. This is because, as it turns out, we experience the same issues even just copying larger files from the offending box to either of our backup servers. This is definitely looking like a network issue.

    quote:


    Which is more important, getting a successful backup or saving temporary disk space?


    The issue here is whether to back up databases to local disk or across a network to another server. As the trade-off is presented above, why, a successful backup is more important, of course. But that isn't really an accurate presentation of the trade-offs here. We get plenty of successful backups running across the network, every day. Everything can be managed via a database maintenance plan, which keeps things nice and easy.

    Furthermore, doing things locally has trade-offs, too. There have been failures backing up to local servers. We have run out of space on the local servers at various times, which terminates the backups abnormally. We cannot always control how much space is available on any given server on any given evening, nor can we snap our fingers and make the gods of system resources dispense more disk.

    I assume DTS would allow something like the following, which would probably be the optimum approach as advocated by others in this thread:

    FOR EACH DATABASE ON THE SERVER
    BEGIN
        BACKUP DATABASE TO LOCAL SERVER DISK
        COPY DATABASE BACKUP FILE TO BACKUP SERVER
        DELETE LOCAL BACKUP FILE
    END

    Of course, this is not the way the database maintenance plans work, which is more like this:

    FOR EACH DATABASE IN THE PLAN
    BEGIN
        BACKUP DATABASE
    END

    FOR EACH DATABASE IN THE PLAN
    BEGIN
        REMOVE OUT-OF-DATE BACKUP FILES
    END

    If you use a database maintenance plan to back up to local disk, the plan backs up all databases, one after another, and only then frees up space by deleting any expired backup files. So there would need to be enough local disk space to hold at least one backup of every database at once. On my servers, this translates to well over 100 GB, and I cannot always count on having that much space to work with.

    We could, following the recommendations herein, eschew the database maintenance plan methodology, write our own backup scripts (using TSQL with DTS, NT Shell Script, VB, whatever) to implement the preferred approach... but that would be adding variables, right? Adding code is adding variables, too.

    I argue with you on this not out of hubris but in all humility. You may be absolutely right about your approach, and if so I hope you can convince me.

    Edited by - Lee Dise on 08/28/2003 08:24:10 AM
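
The disk-space difference between the two schedules in this post can be made concrete. A small Python sketch, assuming each backup is deleted locally as soon as it has been copied off in the per-database loop; the function names and the list of sizes are hypothetical (only the 38 GB figure for the largest backup comes from the thread).

```python
def peak_space_maintenance_plan(backup_sizes_gb, retained_gb=0.0):
    """Peak local disk use when every backup is written before any cleanup.

    retained_gb is whatever older backup files are still on disk while the
    new ones are being written.
    """
    return retained_gb + sum(backup_sizes_gb)


def peak_space_per_database_loop(backup_sizes_gb):
    """Peak local disk use when each backup is copied off and deleted
    before the next one starts: only one file is ever on local disk."""
    return max(backup_sizes_gb, default=0.0)


# Hypothetical backup sizes in GB; only the largest (38) is from the thread.
sizes_gb = [38, 25, 20, 12, 8, 5]

print("maintenance plan peak:", peak_space_maintenance_plan(sizes_gb), "GB")
print("per-database loop peak:", peak_space_per_database_loop(sizes_gb), "GB")
```

With these made-up sizes the maintenance plan needs room for the sum of all backups, while the loop needs room only for the largest one. That is the saving on offer in exchange for the extra scripting.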
