Big Power Issues

  • Boy, what a great editorial.  Your point about the British Airways press releases and statements being vetted by lawyers sure was on point.  In fact, I think that's part of the problem.  I listened to the news on TV about the BA failure being caused by a power supply failure and remarked to my wife that it sounded like a complete crock.  She asked why, and I said that it must have been misinformation or legalese, because a system of that size would certainly have failover systems available.

    I look at the recent OneLogin hack in much the same way.  They're publishing one set of information in public statements, but different information to certain customers.  They're using statements to hedge their bets like "cannot conclusively rule out the possibility" and "may have been able to be accessed".  I know I personally use at least 3 systems that were impacted by the hack, but I don't know how many more because I wasn't contacted about anything and I've not found an article yet with an all-inclusive list.

    No company who's online is completely safe from being hacked.  We design the safest systems we can, but unless we keep up with technology (SSL being broken, for example), we're all vulnerable.  If it does happen, companies need to be honest and transparent about it, instead of resorting to the CYA position of legalese, misinformation and no information.  It does nothing to prevent further damage or restore trust in the company, which is precisely what they need to do.

  • Similar thoughts here. If I had to guess, I would suspect a large, now-ageing back-end infrastructure, originally well designed some time ago by competent, intelligent people who have since moved on, been moved on, or retired. That leaves no staff with a deep understanding of a complicated configuration and structure, or leaves those who do understand it with too little influence to direct investment. The system was designed well enough that its darker, poorly understood areas could keep operating for years without maintenance until something "happened". Maybe even a power cut.

    I have sympathy. Google and Microsoft have the luxury of being software companies first. Management of traditional companies, I think, needs to understand that sector leaders are now applied software developers first. Get that right and you will proactively avoid problems rather than reactively fix errors.

  • The British Airways outage was as frustrating for IT professionals as it was for customers (some of whom were IT professionals).
    When these big disasters occur, there is an opportunity to learn from them, but only if the causes are captured and shared as lessons.  I suspect that the BA outage is down to something embarrassingly easy to avoid, something that was warned about and ignored over and over again.

  • Ed Wagner - Saturday, June 3, 2017 2:56 PM

    No company who's online is completely safe from being hacked.  We design the safest systems we can, but unless we keep up with technology (SSL being broken, for example), we're all vulnerable.  If it does happen, companies need to be honest and transparent about it, instead of resorting to the CYA position of legalese, misinformation and no information.  It does nothing to prevent further damage or restore trust in the company, which is precisely what they need to do.

    Yes and no. I assume that by "we" you mean your company. I'm sure you try to design safe systems, but is each one the safest it can be based on technology, or based on the time you allot to build or fix something? That's a compromise many of us make, and usually we're secure enough if we aren't tested too extensively.

    Keeping up with technology is important. I think something like 80-90% of hacks through technology came through libraries and older technology that hadn't been patched by the company using it. There were patches, but the companies were years late in applying them. As much as I dislike the Windows patches, I do understand why it is better for us all to have our systems patched regularly.

  • At the retail bank I worked for, we had a special team called the 'limit test team' whose job was just to set up a system before it was deployed to production, and then would randomly break stuff to ensure that the system degraded gracefully and could be restored quickly.  They did all sorts of evil things and we learned a lot about HA and DR as a result. I thought it a bit harsh at the time, especially that trick with the hard drive and the screwdriver. 
    On another occasion with, I seem to remember, another company, a passenger plane crashed, narrowly missing a key offsite standby data centre. We were forced to enact all our DR routines on the assumption that the plane had entirely destroyed the data centre. It was quite an eye-opener and we soon opened another facility in a very distant location.
    I have known a power surge of up to 400 V upset a UPS so much that it fried all the downstream equipment connected to it, but that was a long time ago and the current breed of UPS should be OK with this sort of incident. I suspect that the management of BA haven't even been told what really happened. There are management layers busily trying to deflect blame onto suppliers, and less interested, at this stage, in a forensic analysis of the incident.

    Best wishes,
    Phil Factor

  • I hope that you are right, Steve. I'd like to see more companies share what went wrong, when it does go wrong. But I'm also mindful of the fact that such disclosure would make someone look pretty stupid (at least), or incompetent (worse), or downright willfully negligent (really bad). Just being honest here, I know I wouldn't want to appear incompetent. Such disclosure would also likely result in serious disciplinary action or worse. I work in the public sector, for a large state agency. To my knowledge we've never had anything like that happen, thank God. But I can only imagine what would happen if something like this were to happen to us, what with the public trust invested in us, the very high stakes that implies, etc. I'm sure it would mean at the very least getting fired and the ruination of one's career. Possibly even prison. I'm just saying that large groups of people probably have a lot of reasons to pause before announcing to the world how something got screwed up.

    Kindest Regards, Rod
