I don't think you can get a one-minute diagnosis on a newish system.
Continuous improvement and engineering excellence are the way to get there. The cycle I would expect to go through looks something like this:
1. Work out what you think can go wrong
2. Work out how you would detect it
3. Define useful error messages, alerts and communication paths
4. Work out how to engineer out the things in step 1
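
As a rough illustration, the first three steps of that cycle can be written down as a small catalogue of failure modes, each with a detection check, a message a responder can act on, and a place to send it. This is only a sketch: the failure names, thresholds, metric keys and channel names below are all made up for the example and would come from your own system.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class FailureMode:
    name: str                                   # step 1: what can go wrong
    detect: Callable[[Dict[str, float]], bool]  # step 2: how to detect it
    message: str                                # step 3: message a responder can act on
    notify: str                                 # step 3: who gets told


# Hypothetical catalogue; every name, threshold and channel is a placeholder.
CATALOGUE: List[FailureMode] = [
    FailureMode(
        name="disk_nearly_full",
        detect=lambda m: m.get("disk_used_pct", 0.0) >= 90.0,
        message="Disk usage at or above 90%: rotate logs or extend the volume.",
        notify="ops-oncall",
    ),
    FailureMode(
        name="queue_backlog",
        detect=lambda m: m.get("queue_depth", 0.0) > 1000,
        message="Queue depth above 1000: check that consumers are running.",
        notify="app-team",
    ),
]


def diagnose(metrics: Dict[str, float]) -> List[Tuple[str, str, str]]:
    """Return (name, message, notify) for every catalogued failure the metrics match."""
    return [(f.name, f.message, f.notify) for f in CATALOGUE if f.detect(metrics)]


if __name__ == "__main__":
    sample = {"disk_used_pct": 93.0, "queue_depth": 12}
    for name, message, notify in diagnose(sample):
        print(f"[{notify}] {name}: {message}")
```

Each root cause analysis then either adds an entry to the catalogue or, better, removes one because the failure has been engineered out.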
Even the best of us get surprised by the unanticipated ways things manage to go wrong. The important thing is to do the root cause analysis and feed the results back into the four-step process above.
In my experience, continuous improvement naturally leads to refactoring and simplification. This makes systems less likely to go wrong in the first place and much quicker to diagnose when they do.
There is a lot to learn from The Clean Coder by Robert C. Martin.