• sqlvogel (10/21/2014)


    Jeff Moden (10/20/2014)

    I guess I don't understand the problems people are having with temporal data. Type 2 SCDs are easy to setup, implement, and use. As with certain COTS automated testing systems, I wonder if the cure might be worse that the original problem. Even hierarchical data is fairly easy to query for Point-In-Time results using Type 2 SCDs.

    Kimball's "Type 2" pattern tracks changes to attributes of a dimension - apparently not what Ian was trying to achieve. Type 2 is essentially a uni-temporal model only. Because the dimension is usually assumed to be denormalized much of the temporal data that might have been preserved from source is normally lost.

    The Type 2 model (at least as described by RK and as often implemented) is also unsatisfactory when it comes loading historical changes. Implementations and ETL tools that support Type 2 often just assume newly arriving dimension data is always dated at or after the latest (i.e. "current") row in the dimension.

    Interestingly, another limitation of Kimball's Type 2 is also shared by Johnston and Weis's Asserted Versioning approach. Both Type 2 and AV assume only one business key per temporal table (or dimension). A truly general purpose temporal model ought to preserve multiple keys where necessary. Maintaining more than one business key per table is only slightly more tricky than dealing with a single key and of course it does generalise: solve the problem for N keys and you've solved it for just the one.

    I hope Randy will correct me if I'm wrong about Asserted Versioning. I enjoyed his book very much and I recommend it, but it has been a while since I read it.

    Thanks for the info. It does, however, support my quandry. I'm certainly not an expert in temporal science but I don't understand why so many have made it so difficult. It may be that I'm doing something wrong but I just haven't had any of the problems you've sighted above with temporal data especially with TYPE 2 SCDs.

    I will admit that it does take a bit of planning on my part and a "general purpose" solution would eliminate that need but I've also found that just about anything that is truly "general purpose" frequently suffers in the area of performance and/or resource usage for the sake of flexibility. A great example of that is SQL Server itself.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)