Patching Clustered Instances

  • A question on cluster patching?

    I'm about to patch a SQL Server 2008 R2 SP2 CU11 instance to CU12. I start by patching the passive node and, once that is complete, I fail over the instance to the patched node. On failover, are the databases delayed from coming online by any upgrade process being applied? If so, is it a significant delay? (The databases total about 500GB.)

    I just need to know whether our application requires an extended outage for the SQL patching or whether it falls within the parameters of a normal failover outage.

    Any advice would be much appreciated,

    Steve

  • It may take a little longer than a mirroring failover. Check for blocking or any other issues on the server, and check for error messages that give further indication.

    ---------------------------------------------------
    "There are only 10 types of people in the world:
    Those who understand binary, and those who don't."

  • It will take as long as it needs. Have you thought of performing this first in a test/QA environment to see what the results are?

    -----------------------------------------------------------------------------------------------------------

    "Ya can't make an omelette without breaking just a few eggs" 😉

  • It will run through the upgrade script process on the first failover to a patched node, but that should be a matter of a few (2-3) minutes.

    Failover time will depend more on how your 500GB of data is made up: how many databases there are, and how many VLFs their transaction logs contain.

    The other question is: do you really need the CU?

    ---------------------------------------------------------------------

  • Our organisation's policy is to apply all CUs, so it's not something I can change. We did apply this in our test environment some weeks ago but no one took note of how long the script update state took to clear; at that stage there was no requirement to keep the online applications online. In the meantime our business colleagues had decided to run a marketing campaign, which required the apps to remain online.

    My understanding was that the script upgrade state would last a few minutes, but some in our team are saying it may take hours. As it turns out, the business have decided to delay the campaign until after the patching. I'll make a note of the time it takes to perform the upgrade.

    On the same note, though, I've heard of sites that stagger patching their cluster nodes over days or weeks with no problems. I was wondering how the system copes if you have to fail back to an earlier patch level?

    Many thanks for your comments so far.

    Steve

  • SQLSvrStevo (6/13/2014)


    We did apply this in our test environment some weeks ago but no one took note of how long the script update state

    Check the SQL Server log; there may still be some detail in there with timings. You could also uninstall the CU, then reapply it and note the timings.

    -----------------------------------------------------------------------------------------------------------

    "Ya can't make an omelette without breaking just a few eggs" 😉
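As a rough illustration of pulling timings out of a saved copy of the log, here is a minimal Python sketch. It assumes each line starts with a `YYYY-MM-DD HH:MM:SS.ff` timestamp and that the lines of interest contain the word "upgrade". The sample lines and the function name `upgrade_duration_seconds` are hypothetical, made up for the example, and are not real log output.

```python
from datetime import datetime

# Hypothetical log excerpt: the message text is illustrative only and is
# NOT copied from a real SQL Server error log.
SAMPLE_LOG = """\
2014-06-13 22:01:05.12 spid7s      Starting script upgrade step (illustrative text)
2014-06-13 22:02:10.48 spid7s      Running upgrade script (illustrative text)
2014-06-13 22:03:35.90 spid7s      Finished script upgrade step (illustrative text)
2014-06-13 22:03:36.02 spid12s     Some unrelated message
"""


def upgrade_duration_seconds(log_text, keyword="upgrade"):
    """Return seconds between the first and last lines containing keyword.

    Returns None when fewer than two matching lines are found.
    """
    stamps = []
    for line in log_text.splitlines():
        if keyword.lower() not in line.lower():
            continue
        # Assumption: the first 22 characters hold the timestamp.
        stamps.append(datetime.strptime(line[:22], "%Y-%m-%d %H:%M:%S.%f"))
    if len(stamps) < 2:
        return None
    return (max(stamps) - min(stamps)).total_seconds()


duration = upgrade_duration_seconds(SAMPLE_LOG)
print(f"Upgrade-related messages span about {duration:.0f} seconds")
```

The keyword filter is deliberately crude; for a real log you would want to look at the actual message text first and adjust the match accordingly.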

  • I have only ever seen the script upgrade phase take minutes, never hours, and I've done a few.

    We have 'stagger patched' clusters as well. Theoretically, the two nodes could run at different patch levels indefinitely.

    You would not be able to fail back to an unpatched node, which is why they are removed from the list of possible owners.

    See this KB if you have not already:

    http://support.microsoft.com/kb/958734

    ---------------------------------------------------------------------

  • Thanks again for the comments.

    We actually proceeded with the patching yesterday and decided to have a full outage.

    These are the actual times experienced when failing over to the patched node (SQL Server 2008 R2 SP2 CU11 to CU12):

    the first instance took 2 minutes

    the second instance took 3 minutes 10 seconds

    the third instance took 3 minutes

    the fourth instance took 1 minute 30 seconds

    Of this, approximately 30 seconds was the failover process itself.

    Total impact to online systems was one outage of 30 seconds and a second outage of about 3 minutes, roughly three-quarters of an hour later.

    Regards,

    Steve
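For anyone comparing these figures against their own failover window, the arithmetic can be sketched in a few lines of Python. The times are the ones reported in the post above. Treating everything beyond the approximately 30-second failover as upgrade-script time is an assumption made for this example, not something that was measured separately.

```python
# Times reported in the post above, converted to seconds.
reported = {
    "first": 2 * 60,
    "second": 3 * 60 + 10,
    "third": 3 * 60,
    "fourth": 1 * 60 + 30,
}

# Approximate share of each figure attributed to the failover itself.
FAILOVER_SECONDS = 30

# Assumption: whatever remains after the plain failover is upgrade-script time.
upgrade_overhead = {
    name: total - FAILOVER_SECONDS for name, total in reported.items()
}

for name, secs in upgrade_overhead.items():
    print(f"{name} instance: about {secs} s beyond a plain failover")

# The longest single outage across the four instances.
longest = max(reported.values())
print(f"longest outage: {longest} s")
```

On these numbers the extra time per instance falls between one and three minutes, which is in line with the "few minutes" estimates given earlier in the thread.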
