Are You Always Up?

  • Comments posted to this topic are about the item Are You Always Up?

  • I work for a mid-sized financial services company that includes two regional banks. Over my 14 years there, as we've grown we've gone from a fair amount of latitude in taking down systems to highly restricted outage windows, even when everything is redundant & load balanced. It's certainly more challenging and cuts into personal time much more.

  • ryk 98103 - Friday, January 26, 2018 9:04 AM

    I work for a mid-sized financial services company that includes two regional banks. Over my 14 years there, as we've grown we've gone from a fair amount of latitude in taking down systems to highly restricted outage windows, even when everything is redundant & load balanced. It's certainly more challenging and cuts into personal time much more.

    I hate that restrictions on uptime and maintenance cut into personal time. I think it's a sign of a good employer when they balance that with other time off.

  • Even if the server and database are "up", the system as a whole isn't necessarily "available" for every process or use case. I mean, you can have a technically successful fail-over and still not have all the storage mappings, logins, and OS dependencies in place and working. Some users will say it's available and some will say it's not. Likewise, unacceptable latency can exist for specific processes and users even when the server overall is running in top form.
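
    To make that distinction concrete, here is a minimal Python sketch, not taken from any real monitoring tool. It assumes each dependency (storage mapping, login, a latency-sensitive process) has its own probe function that returns the observed latency in seconds or raises on failure. The names `Check`, `availability_for`, and `is_available`, and the thresholds, are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Check:
    """One dependency a user or process needs (hypothetical structure)."""
    name: str                    # e.g. "storage mapping", "app login"
    probe: Callable[[], float]   # returns latency in seconds, raises on failure
    max_latency_s: float         # assumed acceptable latency for this check


def availability_for(checks: List[Check]) -> Dict[str, str]:
    """Report a status per check instead of a single up/down flag."""
    results: Dict[str, str] = {}
    for check in checks:
        try:
            latency = check.probe()
        except Exception as exc:
            # The server may be up, but this dependency is not working.
            results[check.name] = f"unavailable ({exc})"
            continue
        if latency > check.max_latency_s:
            # Working, but too slow for the users who rely on it.
            results[check.name] = "degraded"
        else:
            results[check.name] = "available"
    return results


def is_available(checks: List[Check]) -> bool:
    """The system counts as available only if every check passes."""
    return all(s == "available" for s in availability_for(checks).values())


if __name__ == "__main__":
    def missing_login() -> float:
        raise RuntimeError("login not present on secondary")

    checks = [
        Check("database ping", lambda: 0.01, 1.0),
        Check("nightly report query", lambda: 5.0, 2.0),
        Check("app login", missing_login, 1.0),
    ]
    for name, status in availability_for(checks).items():
        print(f"{name}: {status}")
```

    With the stand-in probes above, the database ping passes while the report query and the app login do not, which is the "some users will say it's available and some will say it's not" situation.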

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • I would say those items are often out of my control. I don't worry about the network with regard to my commitment for uptime or issues with maintenance. Same for the app servers, firewalls, etc.

    This is more a view of what your restrictions for maintenance are, and what your SLA is for making the system available.

  • I've always worked at 24/7 companies. That said, when I worked at a small-to-medium-sized company, I was the only person who could support some of the critical systems (*cough* database/CRM/data warehouse), so I was on call 24/7/365. Yes, they had a maintenance window, for 3 hours on Sunday night, which required me to be up to monitor... The flip side is I got to groom myself like a wild, as opposed to city, hobo and no one complained. Now, working at a large company, the hours of operation are technically 9-5 but processes run all night. The flip side is that there are other people, so we actually have an on-call rotation and can turn our phones off. I much prefer the latter, even if it's a little more rigid in other areas.

  • Our lab runs 6 days a week and everything needs to be available from very early morning until late in the evening. We usually do SQL Server patching on Sundays. Networking will do OS patching on Saturday or Sunday evenings, depending on their workload. They alternate between non-production systems one week and production the following week. This gives them a little extra time to look for any gotchas before the updates hit production. There is a little more latitude for releasing application updates, since the users of the main applications are finished by 8:00pm. We still usually do them on the weekend.

    -Tom

  • Always. Every place that I have worked in my career has been a 24/7/365 shop. These were small to midsize companies where I was the Help Desk Supervisor and the Junior DBA, as well as the second network admin and primary phone system support person. So even if I had a weekend or day off from the on-call rotation, I would still get called. Thankfully, it was most often my own team calling, so we could resolve the issue quickly, as they had already done the triage for the situation.

  • I'm always around. Luckily, since I work in analytics and data science, most of my stuff shuts down after 5PM my time. But I've always conveyed to my boss that I am around if things break, no matter the time or day.

    In the past, when I worked in the video game industry, we developed massively multiplayer online games: hundreds if not thousands of players around the world playing on the same server. While I was not specifically in operations, I had to be available around the clock in case the service went down or something broke and we had to do emergency patches in the AM.

    I can't count the number of times issues struck in the wee hours. We managed this with typical on-call schedules, rotating people on our team to wake up, call the IT guys (or send them cabs at whatever bar they were in that night haha), wake up the programmers (if needed), and then notify the customers that hey, we are up and on the case. Nothing brings me more joy than calling the sys admins at 3AM their time and waking them up.

    Besides that, patching or major launches to the game usually meant 24 hours of work. I have slept at my desk and have covered people up with blankets at the office. Believe it or not, it was actually pretty fun for most of us. We really enjoyed being there and the business was really cool with covering us while there.

    The wildest story thus far is me pulling like 15 hours and checking out super late. I drove home around 3AM and was pulled over by a cop in my city for my tail light being out. I pulled over to the side of the road and turned off my car. The police officer pulled up behind me and forgot to put the car in park as she tried to exit. The police car started to roll forward. She tried to get back in the car and was dragged forward, slamming right into my car and knocking out the other tail light.

    Her boss had to come and write up the report while her co-workers poked fun. I never got a ticket and her boss said this was her first night out alone. She was likely nervous.

  • I should share. I've been lucky to usually not be on call except for something unusual, but I've had a few good ones.

    Working at the nuclear power plant, first year out of school, we had a mandatory deployment of a new radiation tracking system. We were bound by federal regulations that took effect Jan 1. The actual Jan 1 at midnight. I went to work around 5pm, Dec 31. I went home around 10pm Jan 1. Then I worked around 110 hours that week to support an unstable system. Eventually they had to roll back to paper tracking while we struggled with developers who hadn't load tested the servers. Got to work 400+ hours that month, roughly going 12 on/12 off.

    The time I was on call was a rotating week for our team of 20 Ops people. I had to cover all Windows, SQL, Exchange, etc. stuff with others on secondary call if I couldn't figure things out. We had to respond to 4 or 5 pages for free, then got paid by the call. I ended up with 30+ pages as we'd just brought on some remote workers on the other side of the world. My first week made some $$, but I slept with an old-style pager on my chest so it wouldn't wake my wife. Needless to say I started to avoid on call after that.
