• .. The goal is to have a single truth for each datapoint, if we can achieve that then we can make decisions about whether to copy that data somewhere to simplify usage or to mash it all up at a layer above the database ..


    The CAP theorem applies to attempts at using a central EDW as a Single Source Of Truth. You can't reliably aggregate point in time metrics from multiple distributed data sources without tolerating a certain degree of latency, occasional unavailability, and margin for error. What you're really providing is an [Official Version Of Truth]. For a governmental or corporate enterprise, it not so much important that everyone is operating with the most accurate ideal of truth, but rather that everyone at any moment in time is operating with the same good enough version of truth. For example, it's important that society as a whole accept the official outcome of a political election, even though many folks would argue on procedural or philosophical grounds that the outcome was incorrect. Settling for a margin of error is better than remaining in a constant state of disagreement.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho