Downtime

  • Comments posted to this topic are about the item Downtime

  • A really interesting and much-needed topic to spend the day thinking about and debating.

    I have heard of the following approaches, but have never implemented them myself (the companies I work with always seem to be able to afford failover boxes). Rough T-SQL sketches of each appear after the list.

    1. The application can be designed such that all data operations (inserts/updates) are staged on the client machine (in a client-server application) and periodically flushed to the central server. If the central server goes offline, the user can keep working locally.

    My opinion: I believe that this will work great for offline, write-intensive scenarios, e.g., data entry.

    2. I have heard about people using database snapshots, where only write operations go directly to the source database; all client reads are against the snapshot.

    My opinion: if the source is down, the snapshot is down too, because a snapshot lives on the same instance and still depends on the source's data files. I do not see a benefit with respect to reducing downtime.

    3. For occasional-write, mostly read-only scenarios, I have heard about people using read-only databases (the entire database gets refreshed on an update, which might happen once every 6 months or a year, e.g. a copy of the USPS database).

    My opinion: this sounds promising, and as mentioned in the editorial, the database can simply be rebuilt if it crashes.
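
    Purely as an illustration of approach 1, here is a minimal T-SQL sketch of the flush step. Everything in it (the linked server CentralSrv, the Sales database, the table and column names) is hypothetical:

        -- Local staging table holds writes while the central server may be
        -- unreachable; a scheduled job calls this procedure to push them across.
        CREATE PROCEDURE dbo.FlushStagedOrders
        AS
        BEGIN
            SET XACT_ABORT ON;  -- any error rolls the whole flush back
            BEGIN TRANSACTION;  -- spans a linked server, so MSDTC promotes
                                -- it to a distributed transaction

            INSERT INTO CentralSrv.Sales.dbo.Orders
                (OrderID, CustomerID, Amount, CreatedAt)
            SELECT OrderID, CustomerID, Amount, CreatedAt
            FROM dbo.OrderStage;

            -- Clear the stage only after the push committed; if the central
            -- server is offline, the transaction fails and the staged rows
            -- survive for the next attempt.
            DELETE FROM dbo.OrderStage;

            COMMIT TRANSACTION;
        END;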
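
    For approach 2, a snapshot (an Enterprise Edition feature) is created on the same instance as its source, which is exactly why it cannot protect you when the source goes down. Database, file, and path names below are hypothetical:

        -- Create a point-in-time, read-only view of Sales; the logical file
        -- name (Sales_Data) must match the source database's data file.
        CREATE DATABASE Sales_Snap_AM
        ON (NAME = Sales_Data, FILENAME = 'D:\Snapshots\Sales_Data_AM.ss')
        AS SNAPSHOT OF Sales;

        -- The one downtime trick a snapshot does offer: a fast revert of the
        -- source after a bad deployment or a botched update.
        RESTORE DATABASE Sales FROM DATABASE_SNAPSHOT = 'Sales_Snap_AM';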
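
    And approach 3 amounts to restoring a fresh copy over the old one and marking it read-only (database, file, and path names hypothetical):

        -- Rebuild the reference database from the latest vendor backup...
        RESTORE DATABASE UspsRef
        FROM DISK = 'D:\Backups\UspsRef_2010H2.bak'
        WITH REPLACE,
             MOVE 'UspsRef_Data' TO 'D:\Data\UspsRef.mdf',
             MOVE 'UspsRef_Log'  TO 'L:\Logs\UspsRef.ldf';

        -- ...and lock it down until the next refresh.
        ALTER DATABASE UspsRef SET READ_ONLY;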

    Please note that these are only things I have heard of - I have never implemented these to reduce downtime nor have I ever studied them from this perspective.

    Sorry if I was not able to provide much help.

    Thanks & Regards,
    Nakul Vachhrajani.
    http://nakulvachhrajani.com

    Follow me on
    Twitter: @sqltwins

  • Nakul Vachhrajani (10/11/2010)


    1. The application can be designed such that all data operations (inserts/updates) are staged on the client machine (in a client-server application) and periodically flushed to the central server. If the central server goes offline, the user can keep working locally.

    My opinion: I believe that this will work great for offline, write-intensive scenarios, e.g., data entry.

    We actually implemented our software like this, and it works great.

    Now, the reason behind it was not to reduce downtime but to enable continuous work while the network was down...

    However, it will come in handy if we ever need to take down the 'central database'.

  • One of the things we offer with our product, an Electronic Health Record, is a service that creates PDFs of the most relevant data. Every night it dumps each patient's current meds, allergies, and other current information, so that if something catastrophic does happen the client can still continue seeing patients. It lives entirely outside the database and requires a workflow change, but it is a viable solution.

    During major upgrades, some clients request that a read-only copy of the pre-upgrade database remain available so all the data is still accessible while we upgrade. One or two clients maintain an environment like this that they can switch to should the primary go down.

    One thing that's important to remind users of is that whatever downtime they experience, it's frequently less than it would have been without the database at all. One of our clients had their building catch fire. Some of their groups used our product and some hadn't implemented it yet. The ones that did were contacting patients later that same day to move them to another site or reschedule as necessary. The ones that didn't were still trying to sort things out a couple of weeks later.

  • I guess I've been fortunate in that I have not worked with VLDBs where you need 4 hours or more to complete a restore, but I can see the issues that have to be dealt with in those cases. This is where the business needs to understand the cost of a highly available database: if they aren't willing to spend the money on clustering, mirroring, or log shipping so that you CAN be back up and running with little or no downtime, they have to be willing to take the hit when you experience issues.

    If I were to go back to my previous employer, I'd definitely make the case that money needs to be spent to implement a mirroring or log-shipping solution for many of the SQL Server databases I supported. Based on the reading I've done, I'd go with mirroring because of the ways it can be used for upgrades and patches as well as providing HA.
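
    For anyone weighing that option, here is a minimal sketch of what a mirroring setup involves. The server names, port, and database name are hypothetical, the database must use the full recovery model, and the mirror must first be seeded by restoring a backup WITH NORECOVERY:

        -- Run once on each instance: the endpoint the partners talk over.
        CREATE ENDPOINT Mirroring
            STATE = STARTED
            AS TCP (LISTENER_PORT = 5022)
            FOR DATABASE_MIRRORING (ROLE = PARTNER);

        -- On the mirror instance (database already restored WITH NORECOVERY):
        ALTER DATABASE Sales SET PARTNER = 'TCP://principal.corp.local:5022';

        -- On the principal instance:
        ALTER DATABASE Sales SET PARTNER = 'TCP://mirror.corp.local:5022';

        -- During patching, a planned failover from the principal swaps roles:
        ALTER DATABASE Sales SET PARTNER FAILOVER;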

  • Four hours of downtime isn't bad for a disaster recovery, and assuming it was completed off peak hours and the users were notified ahead of time, it seems like a bump in the road. Often, when a database repair needs to be performed, the application is only partially crippled. For example, the order entry functionality may still work while deleted data or index corruption prevents the daily reports from running, so it might make more sense to let the system limp along until close of business and then perform the DBCC repair, restore, or whatever needs doing. Also, I think some DBAs needlessly perform carte blanche rebuilding or defragging of indexes in response to performance issues without delving into the root cause or narrowing it down to a specific table. It shouldn't take a full four hours to restore or re-index one table.
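
    To illustrate the targeted approach, checking and re-indexing one suspect table rather than the whole database might look like this (table and index names hypothetical):

        -- Check just the suspect table for corruption.
        DBCC CHECKTABLE ('dbo.Orders') WITH NO_INFOMSGS, ALL_ERRORMSGS;

        -- Rebuild a single damaged or fragmented index...
        ALTER INDEX IX_Orders_OrderDate ON dbo.Orders REBUILD;

        -- ...or all indexes on that one table.
        ALTER INDEX ALL ON dbo.Orders REBUILD;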

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • A really interesting topic. We work with a 450 GB database and do many more writes than reads. Nodes that send data can queue it for several hours before it gets overwritten, which helps avoid data loss but does nothing for the uptime of the system. A database restore would take around 2 hours.

    We have implemented an active/passive cluster to keep the SQL services live (as I'm sure many of you have), but this of course does not help if the database itself has failed or been corrupted in some way. Mirroring the database doesn't help me either, as I cannot afford to mirror the A/P cluster (I think this is how it has to be done?).

    So we are left with splitting the database into a daily data-collection database (small and quick to restore, say 10 GB) and a large reporting database, which can be offline for longer. Not sure how we will implement the data transfer between the two databases yet (either ongoing or overnight), but I think this is the only solution for us. A sketch of one option follows.
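
    One possible shape for the overnight transfer, sketched with hypothetical names: a single-row watermark table in the reporting database records how far the copy has progressed, so each run moves only new rows:

        -- Nightly job run on the server hosting both databases; assumes
        -- Reports.dbo.TransferWatermark was seeded with one starting row.
        DECLARE @From datetime2, @To datetime2 = SYSDATETIME();

        SELECT @From = LastTransferred
        FROM Reports.dbo.TransferWatermark;

        -- Copy readings collected since the last run into the big reporting
        -- database; the small collection database stays lean and fast to restore.
        INSERT INTO Reports.dbo.NodeReadings (NodeID, ReadingTime, Value)
        SELECT NodeID, ReadingTime, Value
        FROM Collection.dbo.NodeReadings
        WHERE ReadingTime > @From AND ReadingTime <= @To;

        UPDATE Reports.dbo.TransferWatermark
        SET LastTransferred = @To;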

    Watching this discussion with interest.

  • scott_lotus (10/14/2010)


    Mirroring the database doesn't help me either, as I cannot afford to mirror the A/P cluster (I think this is how it has to be done?).

    I believe that this is inaccurate. The A/P cluster is just high availability for that one instance of SQL Server, regardless of how many databases it hosts. A database can be mirrored to another instance of SQL Server, and that instance does not need to be on another cluster. (You might want it to be, but I don't think that it has to be.)

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • WayneS (10/14/2010)


    scott_lotus (10/14/2010)


    Mirroring the database doesn't help me either, as I cannot afford to mirror the A/P cluster (I think this is how it has to be done?).

    I believe that this is inaccurate. The A/P cluster is just high availability for that one instance of SQL Server, regardless of how many databases it hosts. A database can be mirrored to another instance of SQL Server, and that instance does not need to be on another cluster. (You might want it to be, but I don't think that it has to be.)

    I think you may be right there; maybe it's because it was an A/P cluster that I could not mirror to the passive node?

  • You wouldn't want to mirror to the passive node anyway, since it's part of the same cluster: any storage issue would wreck both copies.

    This is slightly off topic. If you are looking for DR ideas for your cluster, please post a thread in that forum.

  • Downtime:

    - What happened to the log-shipped database, which can be used to recover with minimal data loss?

    - What happened to differential backups, which can be restored somewhere else and used for the repair?

    Look, unexpected failures happen; that is the nature of MS products. Applications built on C# and .NET have their share of bugs, and so do SQL Server databases (which is why, as Steve Jones pointed out, you should apply hotfixes more often).

    I'd bet that if you run DBCCs every now and then, or run a SQL Agent job against a standby database with proper alerts in place, you will be able to catch database problems early.
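
    As a sketch of that idea (paths and names hypothetical): restore the latest full and differential backups on another server and run CHECKDB there, so corruption is caught without touching production:

        -- Seed from the last full backup, leaving the database restoring...
        RESTORE DATABASE Sales_Check
        FROM DISK = 'E:\Backups\Sales_Full.bak'
        WITH NORECOVERY,
             MOVE 'Sales_Data' TO 'E:\Data\Sales_Check.mdf',
             MOVE 'Sales_Log'  TO 'E:\Logs\Sales_Check.ldf';

        -- ...apply the latest differential and bring the copy online...
        RESTORE DATABASE Sales_Check
        FROM DISK = 'E:\Backups\Sales_Diff.bak'
        WITH RECOVERY;

        -- ...then check the copy instead of the production database.
        DBCC CHECKDB ('Sales_Check') WITH NO_INFOMSGS, ALL_ERRORMSGS;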

    Thanks,

    Mahidhar Vattem

