Downtime

  • Comments posted to this topic are about the item Downtime

  • A really interesting and much-needed topic to spend the day thinking about and debating.

    I have heard of the following approaches, but have never implemented them myself (the companies I work with always seem to be able to afford failover boxes). Rough T-SQL sketches of each appear after the list.

    1. The application can be designed such that all data operations (inserts/updates) are staged on the client machine (in a client-server application) and periodically flushed to the central server. If the central server goes offline, the user can keep working locally.

    My opinion: I believe that this will work great for offline, write-intensive scenarios, e.g., data entry.

    2. I have heard about people using database snapshots, where only write operations go directly to the source database; all client reads are against the snapshot.

    My opinion: if the source is down, the snapshot is down too, because a snapshot lives on the same instance and still depends on the source's data files. I do not see a benefit with respect to reducing downtime.

    3. For occasional-write, mostly read-only scenarios, I have heard about people using read-only databases (the entire database gets refreshed on an update, which might happen once every 6 months or a year, e.g. a copy of the USPS database).

    My opinion: this sounds promising, and as mentioned in the editorial, the database can simply be rebuilt if it crashes.
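
    Purely as an illustration of approach 1, here is a minimal T-SQL sketch of the flush step. Everything in it (the linked server CentralSrv, the Sales database, the table and column names) is hypothetical:

        -- Local staging table holds writes while the central server may be
        -- unreachable; a scheduled job calls this procedure to push them across.
        CREATE PROCEDURE dbo.FlushStagedOrders
        AS
        BEGIN
            SET XACT_ABORT ON;  -- any error rolls the whole flush back
            BEGIN TRANSACTION;  -- spans a linked server, so MSDTC promotes
                                -- it to a distributed transaction

            INSERT INTO CentralSrv.Sales.dbo.Orders
                (OrderID, CustomerID, Amount, CreatedAt)
            SELECT OrderID, CustomerID, Amount, CreatedAt
            FROM dbo.OrderStage;

            -- Clear the stage only after the push committed; if the central
            -- server is offline, the transaction fails and the staged rows
            -- survive for the next attempt.
            DELETE FROM dbo.OrderStage;

            COMMIT TRANSACTION;
        END;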
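
    For approach 2, a snapshot (an Enterprise Edition feature) is created on the same instance as its source, which is exactly why it cannot protect you when the source goes down. Database, file, and path names below are hypothetical:

        -- Create a point-in-time, read-only view of Sales; the logical file
        -- name (Sales_Data) must match the source database's data file.
        CREATE DATABASE Sales_Snap_AM
        ON (NAME = Sales_Data, FILENAME = 'D:\Snapshots\Sales_Data_AM.ss')
        AS SNAPSHOT OF Sales;

        -- The one downtime trick a snapshot does offer: a fast revert of the
        -- source after a bad deployment or a botched update.
        RESTORE DATABASE Sales FROM DATABASE_SNAPSHOT = 'Sales_Snap_AM';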
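
    And approach 3 amounts to restoring a fresh copy over the old one and marking it read-only (database, file, and path names hypothetical):

        -- Rebuild the reference database from the latest vendor backup...
        RESTORE DATABASE UspsRef
        FROM DISK = 'D:\Backups\UspsRef_2010H2.bak'
        WITH REPLACE,
             MOVE 'UspsRef_Data' TO 'D:\Data\UspsRef.mdf',
             MOVE 'UspsRef_Log'  TO 'L:\Logs\UspsRef.ldf';

        -- ...and lock it down until the next refresh.
        ALTER DATABASE UspsRef SET READ_ONLY;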

    Please note that these are only things I have heard of - I have never implemented these to reduce downtime nor have I ever studied them from this perspective.

    Sorry if I was not able to provide much help.

    Thanks & Regards,
    Nakul Vachhrajani.
    http://nakulvachhrajani.com

    Follow me on
    Twitter: @sqltwins

  • Nakul Vachhrajani (10/11/2010)


    1. The application can be designed such that all data operations (inserts/updates) are staged on the client machine (in a client-server application) and periodically flushed to the central server. If the central server goes offline, the user can keep working locally.

    My opinion: I believe that this will work great for offline, write-intensive scenarios, e.g., data entry.

    We actually implemented our software like this, and it works great.

    Now, the reason behind it was not to reduce downtime but to enable continuous work while the network was down...

    However, it will come in handy if we ever need to take down the 'central database'.

  • One of the things we offer with our product, an Electronic Health Record, is a service that creates PDFs of the most relevant data. Every night it dumps each patient's current meds, allergies, and other current information, so that if something catastrophic does happen the client can still continue seeing patients. It lives entirely outside the database and requires a workflow change, but it is a viable solution.

    During major upgrades, some clients request that a read-only copy of the pre-upgrade database remain available so all the data is still accessible while we upgrade. One or two clients maintain an environment like this that they can switch to should the primary go down.

    One thing that's important to remind users of is that whatever downtime they experience, it's frequently less than it would have been without the database at all. One of our clients had their building catch fire. Some of their groups used our product and some hadn't implemented it yet. The ones that did were contacting patients later that same day to move them to another site or reschedule as necessary. The ones that didn't were still trying to sort things out a couple of weeks later.

  • I guess I've been fortunate in that I have not worked with VLDBs where you need 4 hours or more to complete a restore, but I can see the issues that have to be dealt with in those cases. This is where the business needs to understand the cost of a highly available database: if they aren't willing to spend the money on clustering, mirroring, or log shipping so that you CAN be back up and running with little or no downtime, they have to be willing to take the hit when you experience issues.

    If I were to go back to my previous employer, I'd definitely make the case that money needs to be spent to implement a mirroring or log-shipping solution for many of the SQL Server databases I supported. Based on the reading I've done, I'd go with mirroring because of the ways it can be used for upgrades and patches as well as providing HA.
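
    For anyone weighing that option, here is a minimal sketch of what a mirroring setup involves. The server names, port, and database name are hypothetical, the database must use the full recovery model, and the mirror must first be seeded by restoring a backup WITH NORECOVERY:

        -- Run once on each instance: the endpoint the partners talk over.
        CREATE ENDPOINT Mirroring
            STATE = STARTED
            AS TCP (LISTENER_PORT = 5022)
            FOR DATABASE_MIRRORING (ROLE = PARTNER);

        -- On the mirror instance (database already restored WITH NORECOVERY):
        ALTER DATABASE Sales SET PARTNER = 'TCP://principal.corp.local:5022';

        -- On the principal instance:
        ALTER DATABASE Sales SET PARTNER = 'TCP://mirror.corp.local:5022';

        -- During patching, a planned failover from the principal swaps roles:
        ALTER DATABASE Sales SET PARTNER FAILOVER;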

  • Four hours of downtime isn't bad for a disaster recovery, and assuming it was completed off peak hours and the users were notified ahead of time, it seems like a bump in the road. Often, when a database repair needs to be performed, the application is only partially crippled. For example, the order entry functionality may still work while deleted data or index corruption prevents the daily reports from running, so it might make more sense to let the system limp along until close of business and then perform the DBCC repair, restore, or whatever needs doing. Also, I think some DBAs needlessly perform carte blanche rebuilding or defragging of indexes in response to performance issues without delving into the root cause or narrowing it down to a specific table. It shouldn't take a full four hours to restore or re-index one table.
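
    To illustrate the targeted approach, checking and re-indexing one suspect table rather than the whole database might look like this (table and index names hypothetical):

        -- Check just the suspect table for corruption.
        DBCC CHECKTABLE ('dbo.Orders') WITH NO_INFOMSGS, ALL_ERRORMSGS;

        -- Rebuild a single damaged or fragmented index...
        ALTER INDEX IX_Orders_OrderDate ON dbo.Orders REBUILD;

        -- ...or all indexes on that one table.
        ALTER INDEX ALL ON dbo.Orders REBUILD;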

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • A really interesting topic. We work with a 450 GB database and do many more writes than reads. Nodes that send data can queue it for several hours before it gets overwritten, which helps avoid data loss but does nothing for the uptime of the system. A database restore would take around 2 hours.

    We have implemented an active/passive cluster to keep the SQL services live (as I'm sure many of you have), but this of course does not help if the database itself has failed or been corrupted in some way. Mirroring the database doesn't help me either, as I cannot afford to mirror the A/P cluster (I think this is how it has to be done?).

    So we are left with splitting the database into a daily data-collection database (small and quick to restore, say 10 GB) and a large reporting database, which can be offline for longer. Not sure how we will implement the data transfer between the two databases yet (either ongoing or overnight), but I think this is the only solution for us. A sketch of one option follows.
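
    One possible shape for the overnight transfer, sketched with hypothetical names: a single-row watermark table in the reporting database records how far the copy has progressed, so each run moves only new rows:

        -- Nightly job run on the server hosting both databases; assumes
        -- Reports.dbo.TransferWatermark was seeded with one starting row.
        DECLARE @From datetime2, @To datetime2 = SYSDATETIME();

        SELECT @From = LastTransferred
        FROM Reports.dbo.TransferWatermark;

        -- Copy readings collected since the last run into the big reporting
        -- database; the small collection database stays lean and fast to restore.
        INSERT INTO Reports.dbo.NodeReadings (NodeID, ReadingTime, Value)
        SELECT NodeID, ReadingTime, Value
        FROM Collection.dbo.NodeReadings
        WHERE ReadingTime > @From AND ReadingTime <= @To;

        UPDATE Reports.dbo.TransferWatermark
        SET LastTransferred = @To;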

    Watching this discussion with interest.

  • scott_lotus (10/14/2010)


    Mirroring the database doesn't help me either, as I cannot afford to mirror the A/P cluster (I think this is how it has to be done?).

    I believe that this is inaccurate. The A/P cluster is just high availability for that one instance of SQL Server, regardless of how many databases it hosts. A database can be mirrored to another instance of SQL Server, and that instance does not need to be on another cluster. (You might want it to be, but I don't think that it has to be.)

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • WayneS (10/14/2010)


    scott_lotus (10/14/2010)


    Mirroring the database doesn't help me either, as I cannot afford to mirror the A/P cluster (I think this is how it has to be done?).

    I believe that this is inaccurate. The A/P cluster is just high availability for that one instance of SQL Server, regardless of how many databases it hosts. A database can be mirrored to another instance of SQL Server, and that instance does not need to be on another cluster. (You might want it to be, but I don't think that it has to be.)

    I think you may be right there; maybe it's because it was an A/P cluster that I could not mirror to the passive node?

  • You wouldn't want to mirror to the passive node anyway, since it's part of the same cluster: any storage issue would wreck both copies.

    This is slightly off topic. If you are looking for DR ideas for your cluster, please post a thread in that forum.

  • Downtime:

    - What happened to the log-shipped database, which can be used to recover with minimal data loss?

    - What happened to differential backups, which can be restored somewhere else and used for the repair?

    Look, unexpected failures happen; that is the nature of MS products. Applications built on C# and .NET have their share of bugs, and so do SQL Server databases (which is why, as Steve Jones pointed out, you should apply hotfixes more often).

    I'd bet that if you run DBCCs every now and then, or run a SQL Agent job against a standby database with proper alerts in place, you will be able to catch database problems early.
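
    As a sketch of that idea (paths and names hypothetical): restore the latest full and differential backups on another server and run CHECKDB there, so corruption is caught without touching production:

        -- Seed from the last full backup, leaving the database restoring...
        RESTORE DATABASE Sales_Check
        FROM DISK = 'E:\Backups\Sales_Full.bak'
        WITH NORECOVERY,
             MOVE 'Sales_Data' TO 'E:\Data\Sales_Check.mdf',
             MOVE 'Sales_Log'  TO 'E:\Logs\Sales_Check.ldf';

        -- ...apply the latest differential and bring the copy online...
        RESTORE DATABASE Sales_Check
        FROM DISK = 'E:\Backups\Sales_Diff.bak'
        WITH RECOVERY;

        -- ...then check the copy instead of the production database.
        DBCC CHECKDB ('Sales_Check') WITH NO_INFOMSGS, ALL_ERRORMSGS;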

    Thanks,

    Mahidhar Vattem

