Trusting Systems

  • Comments posted to this topic are about the item Trusting Systems

  • What I constantly find fascinating is the number of management teams who are told that they need to invest more to get the levels of nonfunctional requirements they demand (be it redundancy, scalability, maintainability, flexibility, etc.), who go on to deny the investment (usually time), and who then demand that the nonfunctional requirements be met anyway.

    It seems that in every place where this gets raised as a challenge, and it is pointed out that the instructions were to cut corners and sacrifice the nonfunctional requirements, the following horrible expression gets rolled out: "We are where we are."

    Gaz

    -- Stop your grinnin' and drop your linen...they're everywhere!!!

  • This reminds me of an article I came across earlier in my career... The gist of the piece was that every employee should know how to do their jobs without the use of an automated system. So, that when (not if, but when) the system goes down, the business can continue to function. Granted it won't be at the same productivity level, but business continues nonetheless. I know there are jobs now, where that's just not possible. However, I think, that too many workers don't know how to do their jobs without the aid of a computer. And, I think, there's real value in knowing how to actually do your job without a computer, as opposed to relying on one. If you can do it without a computer, that signals to me that you truly understand the process; instead of just knowing which buttons to push.

    Thanks,
    MKE Data Guy

  • Knowledge Draftsman (5/23/2013)


    This reminds me of an article I came across earlier in my career... The gist of the piece was that every employee should know how to do their jobs without the use of an automated system. ...

    Unfortunately many businesses like stores that SHOULD be able to function in a power outage, for example, cannot because they have become completely dependent on automation.

    It's getting worse. With cloud dependency, it's not enough just to get some temporary power: without a network connection, the business has no local redundancy to operate (Sandy showed how vulnerable these connections are). Obviously stores etc. SHOULD be able to function for a while without communications, but they're no longer designed that way, and even with willing employees, nothing can be accomplished. Plenty of other business operations (remote sales offices, for example) have become completely cloud dependent in the effort to move computation out of the office to centralized locations.

    ...

    -- FORTRAN manual for Xerox Computers --

  • jay-h (5/23/2013)


    Unfortunately many businesses like stores that SHOULD be able to function in a power outage, for example, cannot because they have become completely dependent on automation.

    Barcode scanners basically started killing this in the early 90's and it has only gotten worse.



    ----------------
    Jim P.

    A little bit of this and a little byte of that can cause bloatware.

  • I know most will disagree with what I'm about to say, but it's what I have found throughout my experience.

    If there's no price tag associated with something (downtime, results, etc.), then it's nonexistent for management.

    Therefore, each time we had to plan something, we translated it into a dollar cost or man-hours (which management then translated into a dollar cost). It was still an approximation, but it got the attention it deserved only because money was involved: they had something "tangible" to look at.

    no money -> no worry (don't care)

    some money -> some worry (will care to some extent)

  • Knowledge Draftsman (5/23/2013)


    This reminds me of an article I came across earlier in my career... The gist of the piece was that every employee should know how to do their jobs without the use of an automated system. So, that when (not if, but when) the system goes down, the business can continue to function. Granted it won't be at the same productivity level, but business continues nonetheless. I know there are jobs now, where that's just not possible. However, I think, that too many workers don't know how to do their jobs without the aid of a computer. And, I think, there's real value in knowing how to actually do your job without a computer, as opposed to relying on one. If you can do it without a computer, that signals to me that you truly understand the process; instead of just knowing which buttons to push.

    I think this was fine advice 20, or even 10, years ago. Now we have so many people in the workplace that have never done the job without a computer. I'm not sure of the solution here, but at some point I expect that lots of what we do just won't work without a computer and we may need to delay/shut down parts of our business until the computers are fixed.

  • Steve Jones - SSC Editor (5/23/2013)


    Knowledge Draftsman (5/23/2013)


    This reminds me of an article I came across earlier in my career... The gist of the piece was that every employee should know how to do their jobs without the use of an automated system. So, that when (not if, but when) the system goes down, the business can continue to function. Granted it won't be at the same productivity level, but business continues nonetheless. I know there are jobs now, where that's just not possible. However, I think, that too many workers don't know how to do their jobs without the aid of a computer. And, I think, there's real value in knowing how to actually do your job without a computer, as opposed to relying on one. If you can do it without a computer, that signals to me that you truly understand the process; instead of just knowing which buttons to push.

    I think this was fine advice 20, or even 10, years ago. Now we have so many people in the workplace that have never done the job without a computer. I'm not sure of the solution here, but at some point I expect that lots of what we do just won't work without a computer and we may need to delay/shut down parts of our business until the computers are fixed.

    At a minimum, the operation should be workable for a length of time on emergency power, off the grid. Depending on a full-time cloud connection is a major weakness.

    ...

    -- FORTRAN manual for Xerox Computers --

  • American Airlines is not likely to suffer very much from said outage. They may be the best or only choice for certain destinations. People are not going to give up their frequent flyer miles either. And airlines don't really seem to care about customer satisfaction when it comes to fees for just about everything.

    But this is still a good reminder of why we need to address problems and not hope that they just go away, because they usually don't. Just this week, we have been looking into a SQL Agent job that didn't do its work but still reported successful completion. This caused some billing for that time period not to get done, and it was only discovered on the financial side months later. It is now taking a lot of work to recreate everything, as a lot of the files involved are long past their retention period.

    Of course, maintaining this attitude also requires management that is willing to deal with things that are not already official objectives. In a past job, this was not the case, and any DBA who discovered a problem was blamed for not having fixed it. Needless to say, everything was buried well below the management level, resulting in many hours of work to fix things later. 😠

  • What we're really talking about here is disaster recovery. Disaster doesn't have to mean an earthquake, tornado, or something else on a massive scale. It could mean a temporary power outage on a local scale. We have to understand which systems are critical to the operations of the enterprise and which ones are not. I would guess that the ability for an airline to know their flight schedules is critical.

    All employees like to think their work is high priority. The reality is that very few systems and employees are mission critical. If you have identified the mission critical systems and operations that must be working within a reasonable time period, you can address those first.

    Catching systems that fail in some way that may not be visible or obvious is the job of those who created the system. Any program or script that runs needs to have error recovery in it. You need to have checks and balances within the systems and identify problems early. There may also be a need for the department who uses the systems and data to have checks and balances where they can discover problems early.
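    One way to read "checks and balances" is as a reconciliation step that runs after a job and does not simply trust its reported status. The following is only a minimal sketch in Python under stated assumptions: `JobResult`, `reconcile`, and the row counts are hypothetical names made up for illustration, not part of SQL Agent or any real scheduler API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class JobResult:
    """Hypothetical summary of one scheduled run (not a real scheduler API)."""
    name: str
    reported_success: bool  # what the scheduler says happened
    rows_expected: int      # e.g. source records that were due for processing
    rows_written: int       # e.g. records actually found in the target


def reconcile(result: JobResult) -> List[str]:
    """Return a list of problems; an empty list means the run checks out."""
    problems = []
    if not result.reported_success:
        problems.append(f"{result.name}: scheduler reported failure")
    if result.rows_written != result.rows_expected:
        problems.append(
            f"{result.name}: expected {result.rows_expected} rows, "
            f"found {result.rows_written}"
        )
    return problems


# A run that "succeeded" but produced nothing is still flagged.
silent_failure = JobResult("billing", True, rows_expected=120, rows_written=0)
print(reconcile(silent_failure))
```

    The point of the design is that the check compares what should have happened with what actually landed, so a job that completes "successfully" without doing its work is caught on the same day rather than months later.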

    I don't know if American Airlines will have other problems, such as loss of customers, due to the downtime. If systems fail enough, though, trust will fade and the airline could be hurt down the road.

    Tom

  • OCTom (5/23/2013)


    All employees like to think their work is high priority. The reality is that very few systems and employees are mission critical. If you have identified the mission critical systems and operations that must be working within a reasonable time period, you can address those first.

    Tom

    And any employee that is mission critical should strive to find a way not to be.

    It is better for their health and well-being, as the employee can have the downtime to vacation and take care of personal stuff without expecting to be on call. It is also better for the company, because if that employee were to be hit by a beer truck, what would the company do?



    ----------------
    Jim P.

    A little bit of this and a little byte of that can cause bloatware.

  • Knowledge Draftsman (5/23/2013)


    This reminds me of an article I came across earlier in my career... The gist of the piece was that every employee should know how to do their jobs without the use of an automated system. So, that when (not if, but when) the system goes down, the business can continue to function. Granted it won't be at the same productivity level, but business continues nonetheless. I know there are jobs now, where that's just not possible. However, I think, that too many workers don't know how to do their jobs without the aid of a computer. And, I think, there's real value in knowing how to actually do your job without a computer, as opposed to relying on one. If you can do it without a computer, that signals to me that you truly understand the process; instead of just knowing which buttons to push.

    As system professionals, I think most of us would be able to implement limited, hacked-together procedures at short notice with offline systems (web systems apart), mainly because we can hardly design systems without really understanding their guiding principles.

    My default position on one system-critical application is to make sure the machines are on and recording even if nothing is being calculated. Results can be calculated later from the raw information in Excel if need be, but if it ain't recording, all the computing power in the world won't help. If I couldn't do that, I'd be reduced to smiling and being very calm while quietly informing the responsible parties that, well, nobody is going to die (I would press for multiple true backup alternatives in systems where this was an issue). In some applications I may be able to revert to pen and paper, but that won't be possible in a lot of cases.
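    The "keep recording, calculate later" idea can be sketched as follows. This is a minimal illustration in Python, assuming raw readings can be appended to a local CSV file; `record_raw`, `calculate_later`, the sensor names, and the file name are all hypothetical and not taken from the application described above.

```python
import csv
import os
import tempfile
from datetime import datetime, timezone


def record_raw(path, sensor_id, value):
    """Append one raw reading; nothing is calculated here, so little can fail."""
    with open(path, "a", newline="") as f:
        csv.writer(f).writerow(
            [datetime.now(timezone.utc).isoformat(), sensor_id, value]
        )
        f.flush()
        os.fsync(f.fileno())  # push the row to disk before returning


def calculate_later(path):
    """Derive results (here, a per-sensor average) from the raw log afterwards."""
    totals = {}
    with open(path, newline="") as f:
        for _timestamp, sensor_id, value in csv.reader(f):
            count, total = totals.get(sensor_id, (0, 0.0))
            totals[sensor_id] = (count + 1, total + float(value))
    return {sensor: total / count for sensor, (count, total) in totals.items()}


# Record first; the averages could be computed hours or days afterwards.
log_path = os.path.join(tempfile.mkdtemp(), "raw_readings.csv")
for sensor, value in [("a", 1.0), ("a", 3.0), ("b", 10.0)]:
    record_raw(log_path, sensor, value)
print(calculate_later(log_path))
```

    Keeping the recording path free of any calculation means a bug or outage in the calculation side cannot lose data; the raw file can also be opened in a spreadsheet if the calculation code is unavailable.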

    As fewer and fewer of us make systems for greater and greater numbers of users there is a risk of the end users not understanding the underlying principles because the level of abstraction is so great.

    I think this was a problem with the financial crisis. Management didn't fundamentally understand the weaknesses inherent within the procedures the systems promoted. Surely the auditors should have picked up on this. Sadly I think they were so abstracted from the ground level that the combined weight of the accounting world sailed right across the reef for years.

  • Knowledge Draftsman (5/23/2013)


    However, I think, that too many workers don't know how to do their jobs without the aid of a computer.

    Ironically, I've found a great number of workers that don't know how to do their jobs even when the computer is up and running. 😛

    As a wise man once said, anyone can make a mistake, but to really screw something up, you need a computer. 😀

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


