Deduplication Technology & SQL Backups

  • I have just been introduced to deduplication backup technology. Now we're being asked to apply it to SQL Server backups.

    Does anyone have any experience using this technology? Has anyone had any issues with it being used in conjunction with SQL Server backups?

    My (limited) understanding of the technology is that it computes a checksum for each data block and checks whether it has already backed up an identical block. If so, it adds a pointer, but doesn't actually back up the "new" block since it has the old block. I should add that this technology basically backs up regular files and backup files to a hard drive on another machine (remote site backup technology).
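
    As a rough illustration of that understanding, here is a minimal Python sketch of block-level deduplication. It is not how any particular product works: the fixed 4 KB block size, the use of SHA-256 as the checksum, and the `DedupStore` class are all assumptions made for the example.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size; real products differ


def split_blocks(data, size=BLOCK_SIZE):
    """Cut a byte string into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


class DedupStore:
    """Hypothetical store: each unique block is kept once, files are pointer lists."""

    def __init__(self):
        self.blocks = {}  # checksum -> block bytes
        self.files = {}   # file name -> list of checksums (the "pointers")

    def backup(self, name, data):
        """Back up a file and return how many blocks were actually written."""
        written = 0
        pointers = []
        for block in split_blocks(data):
            digest = hashlib.sha256(block).hexdigest()
            if digest not in self.blocks:
                self.blocks[digest] = block  # new block: store it
                written += 1
            pointers.append(digest)          # known or new, record a pointer
        self.files[name] = pointers
        return written

    def restore(self, name):
        """Rebuild the full file by following its pointers."""
        return b"".join(self.blocks[d] for d in self.files[name])


# Two "backup files" that differ only in the middle block.
monday = b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE + b"C" * BLOCK_SIZE
tuesday = b"A" * BLOCK_SIZE + b"X" * BLOCK_SIZE + b"C" * BLOCK_SIZE

store = DedupStore()
new_monday = store.backup("monday.bak", monday)
new_tuesday = store.backup("tuesday.bak", tuesday)
```

    In this sketch the second backup writes only the one changed block, yet `store.restore("tuesday.bak")` still returns the complete file, because every block the file points to is present in the store.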

    My concern is that this technology will look at two SQL Server backup files and only partially back up the second day's file because part of it looks like data the technology has already backed up. So if that previous backup gets deleted out of the queue, we won't have a full SQL Server backup to go back to.

    Since I haven't been exposed to this technology before, though, I'm willing to admit I have no idea how it interacts with SQL Server backup files. I'd love to hear from you if you know what I'm blabbering about.

    Brandie Tarvin, MCITP Database Administrator
    LiveJournal Blog: http://brandietarvin.livejournal.com/
    On LinkedIn!, Google+, and Twitter.
    Freelance Writer: Shadowrun
    Latchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • We were looking at such technology ourselves but didn't buy it. It was too expensive, and we are looking at 11 million to 16 million dollars in budget cuts for next year.

    My understanding is as long as a file references a given block of data, that block of data will not be deleted just because the first file to write it to disk is deleted. Basically, as long as there is a pointer to the data, it is retained and will provide you with a complete file.
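
    To make that concrete, here is a small Python sketch of reference-counted blocks. It is only a toy model of the behavior described above: the `RefCountedStore` class, the 4-byte block size, and the SHA-256 checksum are assumptions for illustration, not any vendor's design.

```python
import hashlib


class RefCountedStore:
    """Hypothetical dedup store that frees a block only when nothing points to it."""

    def __init__(self, block_size=4):
        self.block_size = block_size  # tiny block size, for readability only
        self.blocks = {}  # checksum -> block bytes
        self.refs = {}    # checksum -> number of pointers to the block
        self.files = {}   # file name -> list of checksums

    def backup(self, name, data):
        # Note: this sketch assumes each file name is backed up only once.
        digests = []
        for i in range(0, len(data), self.block_size):
            block = data[i:i + self.block_size]
            digest = hashlib.sha256(block).hexdigest()
            self.blocks.setdefault(digest, block)
            self.refs[digest] = self.refs.get(digest, 0) + 1
            digests.append(digest)
        self.files[name] = digests

    def delete(self, name):
        """Drop a file's pointers; remove blocks no other file references."""
        for digest in self.files.pop(name):
            self.refs[digest] -= 1
            if self.refs[digest] == 0:
                del self.refs[digest]
                del self.blocks[digest]

    def restore(self, name):
        return b"".join(self.blocks[d] for d in self.files[name])


store = RefCountedStore()
store.backup("day1.bak", b"AAAABBBBCCCC")
store.backup("day2.bak", b"AAAAXXXXCCCC")

# Deleting the older file removes only the block nothing else points to.
store.delete("day1.bak")
restored = store.restore("day2.bak")
```

    After `day1.bak` is deleted, the shared blocks survive because `day2.bak` still references them, so the second file restores in full.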

  • We are currently implementing this technology.

    Lynn is correct about the pointers. However, with backups, those pointers do expire after a period of time.

    The technology is to be implemented in backups and on our SAN. It is projected to save us ~70% of our disk space.

    With the backups, there are two methods that can be employed for SQL Server: agent or agentless. I am implementing the agentless method. We will allow SQL Server to perform the backups and then perform a filesystem backup of the bak files. I talked to a few other shops that implemented this technology first. None of them use the SQL agent; they also perform a file-level backup of the bak files instead.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw
    Learn Extended Events

  • Good to know that the pointers expire after a period of time. Is this configurable in your environment? I think I'd have some issues with blocks disappearing while a backup file is still valid.

    I can tell you that during the dog and pony show, this never came up. Should our budget woes go away and we revisit this option, I have something more to ask about.

  • Backup expirations are configurable. The default is 30 days for files and 14 days for SQL Backups. Since we are going with the file backup version, we were initially going to configure it for 30 days. Now we are changing plans to include weekly, monthly and yearly backups.
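
    As a hedged sketch of what a configurable expiration amounts to, the Python below computes when a backup's pointers would expire under the two defaults mentioned above (30 days for files, 14 days for SQL backups). The function names, the dictionary layout, and the rule that a backup counts as expired on the expiry date itself are assumptions for the example, not the product's actual settings or behavior.

```python
from datetime import date, timedelta

# Defaults quoted in this thread; the keys are made-up labels for the example.
RETENTION_DAYS = {"file": 30, "sql": 14}


def expiry_date(taken_on, kind, retention=RETENTION_DAYS):
    """Date on which a backup of the given kind reaches its retention limit."""
    return taken_on + timedelta(days=retention[kind])


def is_expired(taken_on, kind, today, retention=RETENTION_DAYS):
    """Assumed rule: a backup is expired on or after its expiry date."""
    return today >= expiry_date(taken_on, kind, retention)


taken = date(2010, 3, 1)
check = date(2010, 3, 20)

sql_expired = is_expired(taken, "sql", check)    # agent-based SQL backup
file_expired = is_expired(taken, "file", check)  # bak file backed up as a plain file
```

    With these dates, the same backup would already be expired under the 14-day SQL default but still retained under the 30-day file default, which is why the choice between the agent and file-level approach affects how long a bak file stays restorable.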

  • Wow, the pointer info is good information to have.

    I do know that one of the options is to use the Agent version of the backup. The other is to use Litespeed without compression (which negates the reason we got Litespeed in the first place) and do the Agentless thing.

    I think we're going to have to test this whole thing. And since the default is 14 days for SQL Backups, we should be able to test expired pointers before the changeover deadline.

    Thanks. If anyone comes up with additional information, or you discover any problems, let me know. We've got a few months before we implement, so I'll be keeping an ear out.

    Lynn, the reason our workplace is implementing this is that the storage cost savings more than make up for the cost of the technology itself. Imagine only having to store one copy of a Word document that fifty people keep on different file shares, but is basically the same Word document. Then multiply that by the # of people in your organization and the space of all the documents you have to back up every night. Apparently, there's a significant amount of money that's chewed up by all of this. But since I'm not on the storage teams, I don't know what the real numbers are.
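
    The fifty-copies example can be put into numbers with a short Python sketch. It assumes the copies are byte-for-byte identical and deduplicates whole files by SHA-256 checksum; the `stored_bytes` helper and the stand-in document contents are made up for illustration.

```python
import hashlib


def stored_bytes(files):
    """Return (raw_bytes, deduplicated_bytes) for a list of file contents."""
    raw = 0
    unique = {}  # checksum -> size of the one copy that is kept
    for content in files:
        raw += len(content)
        unique[hashlib.sha256(content).hexdigest()] = len(content)
    return raw, sum(unique.values())


# Stand-in for one Word document (17,000 bytes) saved by fifty people.
document = b"quarterly report " * 1000
copies = [document] * 50

raw, deduped = stored_bytes(copies)
savings = 1 - deduped / raw  # fraction of space saved
```

    Here fifty identical copies take the space of one, a 98% saving for this one document. Real savings depend on how similar the data actually is, so this is an upper bound for the example rather than an estimate for any environment.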

  • They were looking at 185,000 for a dedup backup system, versus purchasing a 16 TB iSCSI SAN for 35,000 for storing backups before dumping to tape.

    Like I said, maybe when funding for K12 increases instead of getting cut.

  • You're welcome Brandie.

  • Lynn Pettis (3/17/2010)


    They were looking at 185,000 for a dedup backup system, versus purchasing a 16 TB iSCSI SAN for 35,000 for storing backups before dumping to tape.

    Like I said, maybe when funding for K12 increases instead of getting cut.

    Pricing seems just about the same as what we were quoted (no idea on the final numbers though - not my dept.).

  • It seems more cost-effective for us to go with the 16 TB iSCSI SANs for now. If we need more space, we can buy another one.

    I'd love to work with one myself.

  • Lynn Pettis (3/17/2010)


    It seems more cost-effective for us to go with the 16 TB iSCSI SANs for now. If we need more space, we can buy another one.

    I'd love to work with one myself.

    We considered alternatives. Then we evaluated the intangible costs and came to the consensus that it was overall cheaper to use the dedup backup.

    The filesystem and Exchange backups here would consume an entire weekend just to perform a full backup.

  • Well, like I said earlier, we are facing an 11 to 16 million dollar cut in our budget for next fiscal year, and the following year we may see more cuts. We had to look at how we spent the money we did have available. It is easier to justify 35,000 than 185,000.

    That is one of the problems of working in the public (K12 education) sector: we depend on shrinking tax dollars.

  • Absolutely - different for different sectors.

    Shrinking education budget

    Increased Taxes

    Sound like a bad trend?
