• For those with RAID; how many actually run consistency checks on a regular basis, to detect single drive corruption?

    For single drives, how often is an online disk check (much less an offline disk check) done?

    For data, how many people actually generate checksum data (if you use Zip, raise your hand) to detect some corruption*, much less ECC data (par2 or ICE or DVDisaster, etc.) to detect and correct some corruption*?

    *I almost never see real corruption; the typical "I can't explain this issue [optional: but a reinstall fixed it]; some file must have gone corrupt" that I actually check out shows no corruption at all; more commonly is a configuration change of some stripe.

    I would also note that planning for regular failure is both very expensive and very limiting; most products don't support truly transparent high availability with 0% downtime at all. Big mainframe hardware (and perhaps midrange systems) does; commodity x86 based hardware and software typically doesn't, with a few exceptions.