Has anyone used Data Domain storage for SQL 2008 backups?

  • We are creating a Microsoft Failover Cluster with Windows Server 2008 and SQL Server 2008. Our storage administrator would like the DBAs to use Data Domain storage (http://www.datadomain.com/) as our disk-based backup device. The backups will likely be slower, but we should have plenty of disk space to keep the backups online. In essence, he will be creating a file share for us on the Data Domain box. I have heard that SQL backups on a cluster do not like to write to a file share. Does anyone have experience with Data Domain in their environment? Is Data Domain storage feasible for use with a cluster?

  • Not sure about the clustering part, but Data Domain has a "guide" for backing up to a CIFS share with SQL Server. Granted, the guide instructs you to pass the username and password in clear text using xp_cmdshell (if the Data Domain is not set up for AD authentication), but it does work. Here's what they have in their guide. I don't have a link to the PDF or I would post it; my storage consultant emailed this to me.

    Configuration and Scripting in Workgroup Mode

    To communicate with a Data Domain system in workgroup mode, you need to turn on xp_cmdshell, which is disabled by default. To do so, run the following commands:

    EXECUTE sp_configure 'show advanced options', 1
    RECONFIGURE WITH OVERRIDE
    GO
    EXECUTE sp_configure 'xp_cmdshell', 1
    RECONFIGURE WITH OVERRIDE
    GO
    EXECUTE sp_configure 'show advanced options', 0
    RECONFIGURE WITH OVERRIDE
    GO

    Example Backup Script

    EXEC xp_cmdshell 'net use k: \\dd510-3\backup abc123 /user:dd510-3\sysadmin'
    BACKUP DATABASE lori_test1 TO DISK = 'k:\lori_sql_test\lori_backup.dmp'
    EXEC xp_cmdshell 'net use k: /d'

    Example Restore Script

    ALTER DATABASE lori_test1 SET RECOVERY SIMPLE
    ALTER DATABASE lori_test1 SET SINGLE_USER
    DROP DATABASE lori_test1
    EXEC xp_cmdshell 'net use k: \\dd510-3\backup abc123 /user:dd510-3\sysadmin'
    RESTORE DATABASE lori_test1 FROM DISK = 'k:\lori_sql_test\lori_backup.dmp' WITH FILE = 1, NORECOVERY;
    EXEC xp_cmdshell 'net use k: /d'

    I personally don't use this, as I am still investigating better methods. I see you posted this a while back, so you have probably already figured out a workaround. If you have a better way, please share, as I am not familiar with Data Domain and my company will not turn on AD authentication for it. Don't ask; they love their Novell and don't want to risk breaking it. If you ask me, it's broken upon installation.

    You can also set the CIFS option on the Data Domain to allow anonymous access, but that's just as bad as passing the credentials in clear text. Hope this helps.

  • Thanks for the feedback. This has been a nightmare. The Data Domain vendor sent me a PDF file which said that its product can be easily integrated with Active Directory (which is what we want in order to avoid hardcoding usernames and passwords in our backup scripts). Several Data Domain upgrades later, we still have issues.

  • Has anyone else implemented Data Domain for their SQL backups? We are installing DD sometime in Q1 of 2013 and would like to know how others have managed this. Use native SQL compression or not? Use the EMC Networker software? Backup to plain file share or use Dedupe storage? We are currently using Ola Hallengren's maintenance scripts to backup natively to a network file share and feel most comfortable continuing with that solution. I would appreciate any guidance.

  • We eventually got our SQL Server database backups working with the Data Domain. (Outside the scope of the original post, we eventually got our Oracle database backups working with it as well.) Although the product literature (and Data Domain Tech Support) led us to believe that using Active Directory would be "easy," it was not. Tech Support was clueless! Our disk admins finally figured it out after *months* of conference calls, patches, and Data Domain upgrades. Unfortunately, I have no idea what they ultimately changed to integrate Data Domain security with Active Directory security.

    Once the CIFS share was configured properly to work with our Active Directory, we were able to route our SQL Server backups to it. We run our backups from PowerShell, using sqlcmd.exe, scheduled under Task Scheduler. All of our databases are encrypted (which causes our backups to be encrypted as well), so compression does not really save us anything. While we were trying to get the Data Domain working, we routed our backups to a Windows file share, and the Windows file share consistently performed faster than the Data Domain CIFS share. De-dupe does not seem to save us much space either, because of the encrypted database backups.

    We have been using Data Domain for roughly two years. We have had to revert to the Windows file share a *number* of times because of failed patches, upgrades, etc. When it works, Data Domain works reasonably well. When it doesn't, it usually doesn't work for a week or two at a time. If it were left up to me, I'd just use a Windows file share, because it's easier to debug when something breaks.

  • We are looking into EMC Data Domain, but I am a bit worried about backup performance.

    Brent Ozar has written a blog entry about dedupe, and I share his opinion.

    /Niels Grove-Rasmussen

  • I've used Data Domain previously; I'm trying to refresh my memory now. I know we also had similar issues with Active Directory (that is, our disk admins did). You also need domain accounts running the SQL Server services in order to write to the Data Domain, and then there is a way they add those accounts so they can write to the share. We replaced Networker and tape backups with Data Domain at that time, so once the retention period passed, backups were gone, period.

    We compressed our backups, since we had third-party software to compress them and save space.

    One thing I'd say is that you may not be able to tell how much space remains. I know it was not our decision to go to Data Domain, but we still have to deal with space issues as they arise.

    The other notable thing is the speed of writing to the Data Domain, and its load. Backups of everything, SQL, Oracle, files, server backups, etc., constantly go to the Data Domain, and we may not know enough about what else is running. We tried to stagger backups and make sure that at least the biggest databases are not being backed up at the same time, so I had each server start one hour after the other. (I moved fulls for test/dev servers to Monday/Tuesday because many full backups run on weekends, and older backups are typically deleted only after the new ones are written, so space consumption spikes and then scales back.)

    The good thing I saw was that our disk admins were able to replicate between datacenters, so we could copy backups across datacenters whenever required (DR and whatnot). Also, it can be handy to have access to all backups.

    Replicating everything could be a big deal, though; we only copied the required backups across.

  • We use Data Domain for backup archiving. It is 1/2 mile away at another site.

    I back up all my databases to the server each resides on, and I keep two days' worth of backups on the server (space permitting). I wrote a program that takes two directories as parameters and copies all the files in the first directory to the second directory (the Data Domain) if they are not already there (which saves the copy time when a file already exists).

    I keep 14 days' worth of backups on the Data Domain (per policy). To accomplish this, I wrote another program that takes a list of directories, each with its own number of retention days. The program looks at each directory and deletes any files older than the number of days given for that directory. That keeps the Data Domain cleaned up. If the 14-day policy ever changes, all I have to do is change the number of days in the cleanup parameter file and the retention adjusts accordingly.

    I don't use any compression, as everything I've read says that compressed or encrypted files do not dedupe well. I send all database backup files to the Data Domain, SQL Server, PostgreSQL, Oracle, MySQL, etc.

    I set the Scheduled Task to run under my Windows account, and the Data Domain has my user as the owner of the directories that I can write to.

    I would not recommend backing up directly to the Data Domain, for two reasons. 1) You are backing up across the network, which means you are tied to the network speed, and if any hiccups happen, you could lose the backup. 2) Not only are you throttled by the network speed, you are also throttled by the time it takes to do the dedupe, since Data Domain is a target, in-line dedupe appliance. That means it dedupes before it writes to any disks, which takes time and slows the backup down even more.

    Sometimes the backup across the network cannot be avoided, especially for situations where you don't have room on the server for the backup. Just know that you are paying both penalties if you do.

    I love the Data Domain (the deduping part; I don't care what the platform is, as long as it is in-line target dedupe). It lets me keep a lot of backups available any time I need them, and I have had several occasions where that came in handy. All I have to do is copy the file back to the appropriate database server and restore the database from it. It works great when someone needs to see a database as it was before certain data was deleted or modified; I just restore it to a test database for their use and drop it when they are done. I have also used it a couple of times for actual restores. Worked like a charm.

    I personally love the setup. Don't have to wait on tapes, and don't have to rely on anyone else.
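    The copy-and-cleanup scheme described above (copy only the files missing from the Data Domain share, then delete files older than each directory's retention) could be sketched roughly like this in Python. The function names and the retention map are illustrative, not the poster's actual programs:

```python
import shutil
import time
from pathlib import Path

def copy_missing(src_dir, dst_dir):
    """Copy every backup file from src_dir to dst_dir (the Data Domain
    share), skipping files that already exist at the destination."""
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for f in src.iterdir():
        if f.is_file() and not (dst / f.name).exists():
            shutil.copy2(f, dst / f.name)   # preserves timestamps
            copied.append(f.name)
    return copied

def purge_old(dir_days):
    """dir_days maps a directory to its retention in days; delete any
    file in that directory older than its retention."""
    now = time.time()
    removed = []
    for d, days in dir_days.items():
        cutoff = now - days * 86400
        for f in Path(d).iterdir():
            if f.is_file() and f.stat().st_mtime < cutoff:
                f.unlink()
                removed.append(f.name)
    return removed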

  • vikingDBA,

    Can you comment on the time it takes to move your SQL backups from the local machine to the Data Domain box?

    I am currently backing up my SQL DBs to the local box with compression enabled. My largest DB is almost 5TB and compresses down to around 1.5TB. From everything I've read, to take advantage of the dedupe on the Data Domain appliance, compression must be disabled.

    I have a bit of anxiety over how long it will take to push my 5TB uncompressed SQL backup over 1 Gb Ethernet to the Data Domain box.
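    The worry above is easy to put rough numbers on: divide the backup size by the usable link throughput. A hedged sketch follows; the 70% efficiency factor is an assumption covering protocol and dedupe-ingest overhead, not a measured figure:

```python
def transfer_hours(size_tb, link_gbps=1.0, efficiency=0.7):
    """Rough wall-clock estimate for pushing a backup over the network.
    efficiency discounts the raw link rate for protocol overhead and
    in-line dedupe ingest cost (an assumption, not a measurement)."""
    size_bits = size_tb * 1e12 * 8           # decimal TB -> bits
    usable_bps = link_gbps * 1e9 * efficiency
    return size_bits / usable_bps / 3600

# A 5 TB uncompressed backup over 1 GbE at ~70% efficiency comes out
# to roughly 16 hours; the 1.5 TB compressed copy would be under 5.
```

    Under these assumptions the uncompressed backup takes about three times as long to move, which is the trade-off against better dedupe ratios.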

  • aferro:

    I would have a bit of anxiety over that, also! I back up dozens of databases, the biggest around 120 GB, on down to 30 MB or so.

    One way to do more is to widen the pipe you use to copy the data. If there were a way to use 10 Gb Ethernet or a direct Fibre Channel link, you could speed it up a bit. I'm not a storage guru, so I can't help with the details there.

    I just use a straight Windows copy command from the program I wrote to copy the files over one by one. It runs every hour, so it grabs the latest transaction log backups.

    This would be problematic on anything approaching 500 GB, let alone anything bigger.

    My suggestion is to research opening up the pipe like I mentioned above.

    Forgot to mention one detail. I copy from server to Data Domain, so the server's NIC is a 1 Gb NIC, and I'm slowed by the network link from those servers to the Data Domain location, which is half a mile away in another building. That is definitely something to consider.

    I think mine is a fibre link, so I'm probably not losing that much, it is actually pretty quick.

    But, I would think I would have a tough time with a 1TB file!

    I mitigate my large backups by backing up to the owning server, then copying to the Data Domain. You could go directly to the Data Domain if it was so big you don't have the room to go first to the server. Going directly would then definitely be restricted by the network link speed.

    You could do Full backups once a week, with nightly incrementals, that would make it faster, unless a large percentage of the records are modified (directly, with page-splits, etc.).

  • We've implemented Data Domain backups for our SQL Servers when running in standalone mode. We are able to script it using xp_cmdshell in an agent job so that it mounts the disk, writes the backup, and then unmounts the disk.

    The original poster asked about using this with a SQL Server cluster, which it doesn't appear anyone answered. We are looking to do the same, and if anyone has had any experience, please let me know.

  • We've implemented the Data Domain with a good amount of success. At first, the SAN admin wanted to remove tlog and full backup scheduling from SQL Agent and use the EMC Networker software to control it. After some questions regarding the viability of the backups in the event of Networker issues, we decided that adding another layer of complexity to the backups wasn't ideal for us. We implemented the Data Domain by hanging a Windows file share off the appliance, so that the SQL Server would see it just like any other file share. Since we were already using Ola Hallengren's maintenance scripts to back up our databases, all we had to do was change the directory the script was pointing to.

  • cmarkowi (5/8/2013)


    The original poster asked about using this with a SQL Server cluster, which it doesn't appear anyone answered. We are looking to do the same, and if anyone has had any experience, please let me know.

    When I posted the original message, I didn't know much about Data Domain. Since then, we have implemented Data Domain for all of our cluster and standalone backups. Initially, we had lots of vendor problems getting security implemented using Active Directory. Even once we finally got past that (which took literally months), we still had to revert to file share backups from time to time when Data Domain patches had issues. (It's a site standard to patch everything often and quickly.) However, Data Domain has been reliable for the last 6 months or so.

    Instead of using xp_cmdshell (which has security issues), we chose to use a PowerShell script (launched by Task Scheduler) which calls the T-SQL "backup database" command via sqlcmd.exe. (By the way, we had issues with the PowerShell 2.0 Invoke-Command cmdlet because we could not trap backup errors reliably with it; "$?" seems to trap errors from sqlcmd.exe reliably.)

    We have the same script scheduled on all nodes in the cluster, and the script checks to see if a given instance is running on that particular node before it backs it up. That way, we avoid backing up the same instance multiple times from Task Scheduler.

    We have over 20 instances writing their backups to Data Domain. Most instances execute a full backup once a week and hourly incremental backups throughout the week. We also have a script that tests the backups (via PowerShell, sqlcmd.exe, and T-SQL "restore database") from each instance once a week. A backup set often has 80+ transaction logs, and we have not had any significant issues with restores. When we have a backup or restore issue, it is usually caused by a network glitch.

    In our experience, Windows file share performance is faster than Data Domain performance, which makes sense because file shares are not performing de-duplication, but Data Domain makes sense for our configuration.

    By the way, we use a UNC path (\\data_domain_server_name\share_name) in our SQL Server backups rather than mapped drive letters. We had all sorts of headaches trying to use mapped drive letters, but UNC fixed that fairly reliably.
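    The thread doesn't include the actual PowerShell, but the two pieces described above, gating on which node currently owns the instance and invoking the backup through sqlcmd.exe, might be sketched like this (the instance names, share path, and service-state lookup are hypothetical; sqlcmd's -S, -b, and -Q switches are real):

```python
def backup_command(instance, database, unc_path):
    """Build a sqlcmd invocation that runs a full backup to a UNC path.
    -b makes sqlcmd exit with a nonzero code on T-SQL errors, which a
    scheduler or wrapper script can then trap."""
    tsql = (f"BACKUP DATABASE [{database}] "
            f"TO DISK = N'{unc_path}\\{database}.bak' WITH INIT")
    return ["sqlcmd", "-S", instance, "-b", "-Q", tsql]

def instances_on_this_node(configured, running_services):
    """Given the instances configured for backup and a map of the local
    SQL Server service states, return only the instances running on
    this node, so a clustered instance is backed up exactly once, from
    the node that currently owns it."""
    return [i for i in configured if running_services.get(i) == "Running"]

# The same script is scheduled on every node; only the owning node's
# service shows "Running", so only it issues the backup, e.g.:
#   for inst in instances_on_this_node(cfg, states):
#       subprocess.run(backup_command(inst, "MyDb", r"\\dd01\sqlbak"),
#                      check=True)
```

    Keeping the node check separate from the command builder makes both halves easy to test without touching a real cluster.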

  • We are looking at Data Domain to assist us in our growing disk space issues.

    I read this thread without seeing the approach our contact provided, which references nsrsqlsv.exe (the NetWorker SQL backup executable).

    I don't know if the configuration is different, but I'm trying to get something like this going:

    declare @sname varchar(25), @dname varchar(25), @stmt varchar(1000)

    select @sname = 'SERVER1', @dname = 'DATABASE1'

    select @stmt = 'nsrsqlsv.exe '                /* backup executable */
        + '-s server.domain.net '                 /* NetWorker server */
        + '-c ' + @sname + '.domain.net '         /* client server */
        + '-A ' + @sname + '.domain.net '         /* virtual server, for clusters */
        + '-l full '                              /* level */
        + '-S 10 '                                /* number of stripes (threads) */
        + '-a "device interface=data domain" '    /* deduplication node */
        + '-b "ourPool" '                         /* backup pool */
        + '-g "ourGroup" '                        /* group */
        + '"MSSQL:' + @dname + '"'                /* database name */

    print @stmt
    exec xp_cmdshell @stmt

    I'll try to remember to repost when we get this working. We have a trial of NetWorker (they don't do that very often).
