Performing a piecemeal restore of a DB with Filestream data?

  • OK, I'm looking for ways to reduce the size of some client backups, and one option that sounds like it might help would be a "Files and filegroup" backup, but I have some concerns / questions on restoring the backups.

    Now, some info on the DB that would be backed up. The DB in question has Filestream data, which is what is pushing the size of the backup up.

    My thoughts would be the following:

    Monday: Full Database backup, Transaction Log backups throughout the day

    Tuesday - Friday: Backup the Primary Filegroup, Transaction Log backups throughout the day

    The offices in question are closed on the weekends.

    If I had to restore the data at some point, I would restore the most recent Primary filegroup backup, then the Monday Filestream from the full backup, and finish up by restoring transaction logs. I would expect that any Filestream data saved since the Monday backup would be lost, which would be acceptable to the client.

    My concern is, from reading this article on piecemeal restores, is that the Filestream data would be unrecoverable, at all in this situation, as I would be performing a point-in-time recovery (Up to the point the last transaction log backup was taken) If so, then this method would be of no use to me, as I would still need to backup the Filestream data every day.

    I realize we could take Differential backups throughout the week (Tue-Fri,) which would capture only the updated / changed / added Filestream information, but there are office politics reasons as to why this is not a viable solution.

    Thanks everyone,

    Jason A.

  • I may be wrong, but from what I was reading, your transaction log backups should contain updates to data stored in a filestream column. Filestreams are supported by all recovery models.

    It would be helpful if others could confirm this.

  • File Stream is basically as much of the data base as any of the files.

    SQL Server is very aware of this.

    I have only done some light testing, so I am not an expert in filestream.

    Size - are you using compression?

    Size and Speed - possibly look into a 3rd party tool?

    You are correct on tryiing to avoid 'piecemeal' processes.

    The problems are likely to show up when you really need a good recovery.

  • If you have a test or development server, test out different recovery senerios. See what works and what doesn't. That would be the best avenue to approach this problem. Plus, when you finish your testing you will know how the recovery process works without having to guess at what may happen.

  • The point in the article about partial (piecemeal) restores is that if you exclude a filestream file from the restore process, you can't restore to point-in-time anyway.

    There isn't an option to "lose some of the filestream data but restore the rest of the data". The database has to be consistent. You can either restore everything to a point-in-time, including the filestream, or you can restore partial files (piecemeal) and have NO access to the other files at all, and end up with potentially deferred transactions (which can be messy).

    Your restore plan has to include bringing the whole database back online, or has to leave whole filegroups offline, inaccessible, and defer transactions. There isn't a "halfway" option like what you're talking about.

    Filestream data is part of the database as far as SQL Server is concerned. If you want to leave it out of the database backups, don't use filestream at all. Have a table of file pointers (UNC paths is the usual solution for that), and leave the files in the file system. That will give you what you need in terms of partial restores being possible. You might end up with pointers that don't have a target-file, but there are ways to clean those up or use error-handling to deal with it during runtime.

    One of the main reasons to use filestream is to keep the binary data transactionally consistent with the database. That means in the database backups, and in the database restores. If you aren't going to use that, ignore filestream and simplify your whole process.

    The main use of partial restores is to get critical parts of the database online FAST while other parts are restored more slowly. Like an audit or archive dataset, which is likely very large and thus might take a long time to bring back online after a disk failure or similar situation. You can restore that over a longer period of time, while getting critical OLTP tables online fast from a much smaller backup file. You still get it all online, but you gain reduced downtime for customer-/user-facing data.

    Using it to reduce backup storage space is usually better accomplished by using a mix of Full and Diff backups. Diff backups are better for that than filegroup backups, usually.

    I think that's what you're trying to accomplish, from the description, so that's the direction I recommend aiming for. Possibly a mix of Full and Diff backups, and moving from Filestream to pointers.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • jasona.work (4/17/2012)


    I realize we could take Differential backups throughout the week (Tue-Fri,) which would capture only the updated / changed / added Filestream information, but there are office politics reasons as to why this is not a viable solution.

    This totally drives me crazy, when office politics interferes with good practice. Instead of accepting differential backups they would rather use a dubious solution and potentially lose data. Piecemeal backups are definitely not a solution for what you want, not if you want to avoid having a corrupt database.

    Hope the real issue is that they don't understand how differential backups work. I recommend a good sit down and explain that the two best options are compression (2008 Enterprise or 2008R2 Std) and differential backups, and how these work. Maybe a bit of knowledge will help.

    Failing that they need to begin archiving data.

    Cheers

    Leo

    Leo
    Nothing in life is ever so complicated that with a little work it can't be made more complicated.

  • Leo.Miller (4/17/2012)


    jasona.work (4/17/2012)


    I realize we could take Differential backups throughout the week (Tue-Fri,) which would capture only the updated / changed / added Filestream information, but there are office politics reasons as to why this is not a viable solution.

    This totally drives me crazy, when office politics interferes with good practice. Instead of accepting differential backups they would rather use a dubious solution and potentially lose data. Piecemeal backups are definitely not a solution for what you want, not if you want to avoid having a corrupt database.

    Hope the real issue is that they don't understand how differential backups work. I recommend a good sit down and explain that the two best options are compression (2008 Enterprise or 2008R2 Std) and differential backups, and how these work. Maybe a bit of knowledge will help.

    Failing that they need to begin archiving data.

    Cheers

    Leo

    Well, after having read the replies, I'm dropping the idea of doing filegroup backups, as we'd have to get the filestream as well, defeating the purpose.

    As for using backup compression, I'll have to check that that is enabled as that may help, although the data in the filestream is generally image files in JPG or TIFF, so they're already compressed.

    As for the politics, it's not so much that the bosses don't understand differential backups, as the company policy has always been "full backups are the only backups you can trust." We don't control the backups, we assist clients with the equipment in their offices (often remotely,) so there's always some concern that a client will take a backup, and just leave the media in the office, or lose the media, or damage it. So, I'm likely stuck with full backups...

    Thanks,

    Jason

  • jasona.work (4/18/2012)


    Leo.Miller (4/17/2012)


    jasona.work (4/17/2012)


    I realize we could take Differential backups throughout the week (Tue-Fri,) which would capture only the updated / changed / added Filestream information, but there are office politics reasons as to why this is not a viable solution.

    This totally drives me crazy, when office politics interferes with good practice. Instead of accepting differential backups they would rather use a dubious solution and potentially lose data. Piecemeal backups are definitely not a solution for what you want, not if you want to avoid having a corrupt database.

    Hope the real issue is that they don't understand how differential backups work. I recommend a good sit down and explain that the two best options are compression (2008 Enterprise or 2008R2 Std) and differential backups, and how these work. Maybe a bit of knowledge will help.

    Failing that they need to begin archiving data.

    Cheers

    Leo

    Well, after having read the replies, I'm dropping the idea of doing filegroup backups, as we'd have to get the filestream as well, defeating the purpose.

    As for using backup compression, I'll have to check that that is enabled as that may help, although the data in the filestream is generally image files in JPG or TIFF, so they're already compressed.

    As for the politics, it's not so much that the bosses don't understand differential backups, as the company policy has always been "full backups are the only backups you can trust." We don't control the backups, we assist clients with the equipment in their offices (often remotely,) so there's always some concern that a client will take a backup, and just leave the media in the office, or lose the media, or damage it. So, I'm likely stuck with full backups...

    Thanks,

    Jason

    Have I missed something here? Where is it stated that filestream data is NOT backed up in full, differential, or t-log backups? What I read said filestream is supported in all recovery models.

    Can someone point me to information that says otherwise?

  • Lynn, you are absolutely right that if we were to do Differential backups, it would catch the updated filestream information. I was talking about my original thought of doing File / Filegroup backups, and only backing up the Filestream portion once a week.

    I also like your idea of a test setup, so I can muck around with this more, I just need to find the time to set up the environment.

    Thanks,

    Jason

  • Whoever has the rule about Full backups only, needs to be educated on the point that they either need to modify that policy (and there's no good reason not to), or they need to cough up the cash for the extra disk space, or they need to define their data loss acceptance policy.

    If you aren't doing log backups, you can't do point-in-time data recovery, which means they have to define an explicit policy for losing data changes since the last full backup. They should have policies in place for all departments and customers who depend on that data, making it clear that their policies are designed to allow for data loss, and why that is.

    Not doing diff backups isn't quite that big a deal, it's just about saving costs on disk space, and saving time on backup processes. If they have a regular maintenance window which the backups can be done in, the fact that diff backups are usually faster and take less hardware resources (and cause less slowdowns in the database), then that won't matter much. If they want to pay for the extra drive space for more frequent full backups, that's fine too. And if they are happy with, for example, weekly full backups and hourly log backups, which will accomplish much the same thing, that's fine too.

    The only other real advantage to diff backups is that they sometimes allow for faster point-in-time restores. Not always, but sometimes. For example, a full backup from the weekend, plus a Friday diff, plus log backups up to the time of a crash on a Friday, will probably be faster than a weekend full plus a week's worth of log backups.

    If they don't want that, then they need to have a policy in place, and they need to accept the business costs of their policy. Simple as that.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Hi,

    I have a question. I have a Filestream database with 1.8To of data (6900 files - 20Go for filestream). The restore database will take 17 Hours.

    To improve the backup duration, I make a backup by groupfiles (see database structure below).

    Basically, is it possible to restore all Groupfiles without filestream (PRIMARY_DATA filegroup) and restore it in second step ?

    Functionnally, the developper team will desactivate the access to filestream to avoid any data corruption.

    Regards,

    Eric

    /****** Object: Database [STOCKAGE] Script Date: 03/20/2015 10:03:07 ******/

    CREATE DATABASE [STOCKAGE] ON PRIMARY

    ( NAME = N'STOCKAGE', FILENAME = N'Q:\Data\STOCKAGE.mdf' , SIZE = 70558720KB , MAXSIZE = UNLIMITED, FILEGROWTH = 524288KB ),

    FILEGROUP [EMIR_ACTIVE]

    ( NAME = N'STOCKAGE_EMIR_ACTIVE', FILENAME = N'Q:\Data\STOCKAGE_METADONE_EMIR_ACTIVE.ndf' , SIZE = 1024000KB , MAXSIZE = UNLIMITED, FILEGROWTH = 512000KB ),

    FILEGROUP [EMIR_ARCHIVE]

    ( NAME = N'STOCKAGE_EMIR_ARCHIVE', FILENAME = N'Q:\Data\STOCKAGE_METADONE_EMIR_ARCHIVE.ndf' , SIZE = 1024000KB , MAXSIZE = UNLIMITED, FILEGROWTH = 512000KB ),

    FILEGROUP [METADONE_ACTIVE]

    ( NAME = N'STOCKAGE_METADONE_ACTIVE', FILENAME = N'Q:\Data\STOCKAGE_METADONE_ACTIVE.ndf' , SIZE = 1327169536KB , MAXSIZE = UNLIMITED, FILEGROWTH = 524288KB ),

    FILEGROUP [METADONE_ARCHIVE]

    ( NAME = N'STOCKAGE_METADONE_ARCHIVE', FILENAME = N'Q:\Data\STOCKAGE_METADONE_ARCHIVE.ndf' , SIZE = 256683008KB , MAXSIZE = UNLIMITED, FILEGROWTH = 524288KB ),

    FILEGROUP [METADONE_STATIC]

    ( NAME = N'STOCKAGE_METADONE_STATIC', FILENAME = N'Q:\Data\STOCKAGE_METADONE_STATIC.ndf' , SIZE = 62167040KB , MAXSIZE = UNLIMITED, FILEGROWTH = 524288KB ),

    FILEGROUP [PRIMARY_DATA] CONTAINS FILESTREAM DEFAULT

    ( NAME = N'STOCKAGE_files', FILENAME = N'K:\FileStream\STOCKAGE_files' )

    LOG ON

    ( NAME = N'STOCKAGE_log', FILENAME = N'H:\LOG\STOCKAGE_log.ldf' , SIZE = 8424256KB , MAXSIZE = 2048GB , FILEGROWTH = 10%)

    GO

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply