Netapp snap for files, log shipping for database?

  • One of our top challenges is to create a disaster recovery site so we can recover from an outage in under 30 minutes. Our database is now about 2 TB, and it has a dependency on a file share containing over 2 TB of scanned images. A full backup with native SQL takes about 3.5 hours, and that will grow. We also take nightly differentials and log backups every 15 minutes.
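    For reference, that cadence maps onto plain native commands something like this (a sketch only — database name and paths are hypothetical, and WITH COMPRESSION requires SQL Server 2008 or later, so it would only apply after an upgrade from 2005):

    ```sql
    -- Weekly/periodic full backup (compression can shrink the 3.5-hour window):
    BACKUP DATABASE YourDB TO DISK = N'\\backupshare\YourDB_full.bak'
        WITH COMPRESSION, CHECKSUM, INIT;

    -- Nightly differential:
    BACKUP DATABASE YourDB TO DISK = N'\\backupshare\YourDB_diff.bak'
        WITH DIFFERENTIAL, COMPRESSION, CHECKSUM, INIT;

    -- Every 15 minutes, via a SQL Agent job:
    BACKUP LOG YourDB TO DISK = N'\\backupshare\YourDB_log.trn'
        WITH COMPRESSION, CHECKSUM, INIT;
    ```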

    One suggestion was to institute log shipping (or mirroring) for the database and NetApp Snap technology for the images, since our disk back end is NetApp.

    Also under consideration is training DBAs to handle NetApp's SnapManager for SQL Server, but the suggested training is long and costly, requiring DBAs to learn quite a bit of storage engineering.

    The advantage of staying with "native SQL" techniques is that we have several experienced DBAs, and it keeps DR within their skill sets as well as within the skill set of any new hires.

    Our systems staff are learning how to handle the storage side of NetApp, and they do not want to learn anything about databases.

    The other, original, proposal was for a couple of DBAs to learn the NetApp approach to backups, restores, etc. That means spending quite a bit to train a handful of DBAs, and it creates exposure if they leave the company. It seems you either "buy in" to a particular storage technology 100% for the long term, or not at all.

  • Instead of using NetApp and Snap, you want to implement replication across to the DR site for your image files. That way, all image files are immediately written to the DR site. You would also set it up as bi-directional replication so that any files written on the DR site are replicated back to your production site.

    For the databases, the option I would pursue is SQL Server 2012 AlwaysOn Availability Groups. If that is not available and you cannot upgrade yet, then straight database mirroring. If you have to rely on NetApp (or any SAN technology), then you should be looking at the mirroring capabilities on the SAN. With NetApp, you can set up the LUNs supporting the databases with a mirror that can be replicated to your DR site (note: you still have to copy that data across the first time, and copying a compressed backup file for database mirroring will take less time).
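    The database mirroring route is mostly a handful of T-SQL statements. A rough sketch, with hypothetical server and database names (the mirror must first be seeded from a full backup restored WITH NORECOVERY):

    ```sql
    -- 1. On each server, create a mirroring endpoint:
    CREATE ENDPOINT Mirroring
        STATE = STARTED
        AS TCP (LISTENER_PORT = 5022)
        FOR DATABASE_MIRRORING (ROLE = PARTNER);

    -- 2. Pair the servers. Run on the mirror server first:
    ALTER DATABASE YourDB SET PARTNER = 'TCP://prodserver.domain.com:5022';
    -- Then on the principal server:
    ALTER DATABASE YourDB SET PARTNER = 'TCP://drserver.domain.com:5022';
    ```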

    NetApp's (or any SAN's) snap technology is not going to be frequent enough to prevent data loss. Additionally, you have to be careful about the timing of the database backups relative to the snap clone, or you could end up with the database containing entries that refer to image files that do not exist, or image files with no entry in the database.

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • We're currently on a SQL 2005 active/passive cluster and will probably be upgrading straight to 2012 eventually (next year). I don't fully understand NetApp snap technology yet, but what I'm hoping management will adopt is a combination of Snap for non-database files and something more native to SQL Server (the DBA world) for the database.

    The FlexClone idea, where you can create a copy of the production database on a dev box with no disk footprint, is appealing, but due to bandwidth limitations from the office (DR location) to the server facility I'm not sure that would work now. Of course that has implications for log shipping too, but I/O for that should be light by comparison.

    I think next year the DR mirror site will be in another state so the bandwidth for that shouldn't be an issue then.
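    Once on 2012, the availability-group option mentioned earlier would look roughly like this (a sketch under assumptions: server and group names are hypothetical, and both replicas must already belong to a Windows failover cluster with mirroring endpoints in place):

    ```sql
    CREATE AVAILABILITY GROUP YourAG
        FOR DATABASE YourDB
        REPLICA ON
            N'PRODSERVER' WITH (
                ENDPOINT_URL = 'TCP://prodserver.domain.com:5022',
                AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
                FAILOVER_MODE = AUTOMATIC),
            N'DRSERVER' WITH (
                ENDPOINT_URL = 'TCP://drserver.domain.com:5022',
                AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,  -- async over the WAN
                FAILOVER_MODE = MANUAL);
    ```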

  • My impression was that backups done via NetApp's SnapManager for SQL could be done very often and very fast, and that for databases in full recovery it's an either/or proposition -- you can't back those databases up with native SQL because you'll break the NetApp log chain.

    One thing we haven't tested is the I/O freeze time. One of our data files is in line to be split up, but currently it is 1.1 TB. You also have to make sure backups run with SnapManager for SQL don't step on each other -- you can't have a log backup overlap a full backup, etc.

    Add in the possibility that the database might be migrated off of NetApp in a year or two, and it's quite an investment to go that route now.

  • Indianrock (8/15/2012)


    My impression was that backups done via NetApp's SnapManager for SQL could be done very often and very fast, and that for databases in full recovery it's an either/or proposition -- you can't back those databases up with native SQL because you'll break the NetApp log chain.

    One thing we haven't tested is the I/O freeze time. One of our data files is in line to be split up, but currently it is 1.1 TB. You also have to make sure backups run with SnapManager for SQL don't step on each other -- you can't have a log backup overlap a full backup, etc.

    Add in the possibility that the database might be migrated off of NetApp in a year or two, and it's quite an investment to go that route now.

    I don't understand how this relates to your original question. However, you can use native backups with no problem as long as you use the COPY_ONLY parameter.
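    A COPY_ONLY backup is a one-liner; it takes a full backup without resetting the differential base or disturbing an existing backup chain (database name and path are hypothetical):

    ```sql
    BACKUP DATABASE YourDB
        TO DISK = N'\\backupshare\YourDB_copyonly.bak'
        WITH COPY_ONLY, COMPRESSION, CHECKSUM;
    ```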

    The reason SnapManager backups are so fast is that they are not 'backing up' the database; they are taking a snapshot of the LUN where the database files are located. This is why you cannot put the system databases on the same LUN as the data files, and why they also recommend having a separate LUN for the data files.

    If you have data files from multiple databases on a single LUN, then all of those databases are 'backed up' at the same time. This causes issues if you want to restore just one of the databases, because then you cannot restore the LUN to the snapshot; instead you have to present the snapshot as a new LUN and use a native restore on the single database.

    How this all relates to using Snap backups for DR is simple: you cannot do it without copying the full database across the wire every time. You cannot just copy the snapshot, because the snapshot has to be able to read the data from the source system.

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • Unless SnapManager's got much more complicated in the last 2 years, the training really shouldn't be that long. Or costly.

    The front end was god-awful, but the technology's ace.

    YMMV, but 700 GB used to back up in well under a minute. As Jeffrey says, it's quick because it's not really backing up as such, just writing metadata.

    The log backups are more or less just native SQL log backups, although IIRC they're not compatible with native SQL backups.

    Having said all that, I'll echo Jeffrey - SnapManager's great for on-site backup/restore. Off-site DR, not so much.

    SAN replication/mirroring and/or SQL's own HA solutions are much more suitable.

  • What exactly did you mean by

    "Having said all that, I'll echo Jeffrey - SnapManager's great for on-site backup/restore. Off-site DR, not so much."?

  • For the images you may be better off with Windows Distributed File System. This is much improved under Windows Server 2008.

    -----------------------------------------------------------------------------------------------------------

    "Ya can't make an omelette without breaking just a few eggs" 😉

  • Indianrock (8/16/2012)


    What exactly did you mean by

    "Having said all that, I'll echo Jeffrey - SnapManager's great for on-site backup/restore. Off-site DR, not so much."?

    Just that, as SnapManager for SQL doesn't create actual backup files for the data files, you can't move them to another site for DR. Another technology is required for that.

    May have conflated the two issues (DR site & SQL backups) from your posts.

  • Well, I'm trying to get enough information about the best approach to backups, restores and disaster recovery without having to take $10,000 in courses to make the decision. I'm also trying to get to the truth behind lots of marketing hype.

    I did read that the NetApp approach makes copies ("backups", "clones", whatever you want to call them) of the database available on development servers with FlexClone. And supposedly the mirror/data files/snap remain on the NetApp, with the database on the dev SQL Server using almost no disk space on the dev server. That would be helpful and save lots of time copying big SQL backups around the network.

    But from what's been said in this thread, it sounds like for keeping a DR hot standby current we might be better off with log shipping, SQL Server mirroring or, once on SQL 2012, availability groups. I never got the impression from NetApp reps that a DR copy couldn't be kept very current with the SnapManager approach -- taking "snap" backups every 15 minutes if necessary. They did caution that you can't have two such "backups" happening at the same time -- no overlap between "log" and "database" backups.
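    The restore side of the log shipping option is straightforward to picture (a sketch with hypothetical names; the standby stays in a restoring state until failover):

    ```sql
    -- Initialize the standby once from a full backup:
    RESTORE DATABASE YourDB FROM DISK = N'\\backupshare\YourDB_full.bak'
        WITH NORECOVERY;

    -- Then apply each shipped log backup in sequence, staying in NORECOVERY:
    RESTORE LOG YourDB FROM DISK = N'\\backupshare\YourDB_log.trn'
        WITH NORECOVERY;

    -- At failover time, bring the standby online:
    RESTORE DATABASE YourDB WITH RECOVERY;
    ```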

    Obviously it's tough for a DBA without much storage knowledge to get his/her head around this.

    I had come to the conclusion that once a database reaches a certain size, "native" SQL tools just wouldn't hack it. I will look into the Windows Distributed File System approach.

    thanks

    Randy

  • $10,000? Crikey! I'll do it for $4k 😀

    Indianrock (8/17/2012)


    I did read that the NetApp approach makes copies ("backups", "clones", whatever you want to call them) of the database available on development servers with FlexClone. And supposedly the mirror/data files/snap remain on the NetApp, with the database on the dev SQL Server using almost no disk space on the dev server. That would be helpful and save lots of time copying big SQL backups around the network.

    Don't think of them as distinct backups, copies or clones. They're not separate from the original data. Or, not at first.

    As long as the data in the original/clone/backup stays the same, they're the exact same blocks on the underlying disks.

    Only when the data changes are new blocks written - and then only for the blocks that contain changes.

    NetApp operations are super quick because all they're doing is updating the metadata - i.e. which blocks each logical disk contains.

  • That makes it sound like DEV database IO would be affecting prod database IO

  • Indianrock (8/17/2012)


    That makes it sound like DEV database IO would be affecting prod database IO

    To a point, yes. But I'd be impressed if you were hitting the dev server hard enough to see any effect on prod.

  • Well most of our dev servers have a full copy of the production database, all 2 TB of it -- poor practice I know but it is what it is for now. And the testing there mimics what runs in production although it is sporadic.

    I just posted a question on the Netapp Community forums which really gets at the questions most on my mind. I'm going to paste that in here and move it to a new thread if that is recommended:

    "Learn NetApp software or use native SQL Server tools: We have a very basic decision to make, and one choice may not be very popular on this forum, but I have to ask. Our main production database uses a NetApp filer -- whether it will still be on NetApp in 2 years is unknown. Two of us are being asked to learn to use SnapManager and SnapManager for SQL Server. As far as I know, no other systems technician or DBA will be learning these tools, which leaves backups and disaster recovery in just two hands -- two hands that could leave the company at any time.

    One alternative, which is probably more popular with our large DBA team, would be to use something native to SQL Server like log shipping or mirroring to keep a hot standby copy of the database current, along with Windows Server 2008 DFS replication to keep our large scanned-image share current on the disaster recovery filer. ( http://technet.microsoft.com/en-us/library/cc771058.aspx ).

    The training proposed by NetApp would be via a company called Unitek; the SMSQL course is 40 hours for about $4,500, but prerequisite classes might be needed. Making a large investment and having database staff take a big step towards becoming storage experts is a major decision. I think a company pretty much needs to decide that NetApp filers and software will be a large part of its mix for many years before heading that way. It just isn't clear that's our future here.

    Aside from that is the whole personal career equation -- what kind of skills can you take with you since nobody stays with a company for 30 years anymore."

  • NetApp SnapManager (which, incidentally, I am not keen on at all) is implemented at the storage level. Native SQL backups, or backups using a third-party tool such as Red Gate SQL Backup, are implemented at the application server level.

    IMHO, native or third-party backups offer a finer degree of recovery and are more flexible.

    -----------------------------------------------------------------------------------------------------------

    "Ya can't make an omelette without breaking just a few eggs" 😉
