• What we're really talking about here is disaster recovery. A disaster doesn't have to be an earthquake, a tornado, or anything else on a massive scale. It can be as small as a temporary, local power outage. We have to understand which systems are critical to the operations of the enterprise and which are not. I would guess that an airline's ability to know its flight schedules is critical.

    All employees like to think their work is high priority. The reality is that very few systems and employees are mission-critical. Once you have identified the mission-critical systems and operations that must be working again within a reasonable time, you can address those first.

    Catching failures that are not visible or obvious is the job of those who created the system. Any program or script that runs needs error recovery built in. You need checks and balances within the systems so that problems are identified early. The department that uses the systems and data may also need its own checks so that it can discover problems early.
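
    As a rough sketch of what error recovery plus a sanity check could look like in a script, here is one way to do it in Python. The names run_with_recovery, step, and validate are made up for illustration and do not come from any particular system; the retry count and delay are arbitrary assumptions.

```python
import logging
import time

log = logging.getLogger("job")


def run_with_recovery(step, validate, retries=3, delay_seconds=0.0):
    """Run step(), retry on failure, and reject results that fail validate().

    step and validate are hypothetical callables supplied by the caller.
    """
    for attempt in range(1, retries + 1):
        try:
            result = step()
        except Exception as exc:  # broad on purpose: nothing should fail silently
            log.warning("attempt %d of %d failed: %s", attempt, retries, exc)
        else:
            if validate(result):
                return result
            # The step ran without an error but produced suspect data.
            # This is the failure that is not visible or obvious.
            log.warning("attempt %d of %d returned suspect data", attempt, retries)
        if attempt < retries:
            time.sleep(delay_seconds)
    # Give up loudly so a person or a monitor sees the problem early.
    raise RuntimeError(f"job failed after {retries} attempts")
```

    The point of the validate step is that a job can finish without an error and still hand back bad data, such as an empty schedule. Raising at the end, rather than returning nothing, is what lets the people downstream find out early.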

    I don't know whether American Airlines will have other problems, such as a loss of customers, because of the downtime. If systems fail often enough, though, trust will fade and the airline could be hurt down the road.

    Tom