• Welsh Corgi (3/24/2012)


    Could someone please help me explain to a nontechnical person why it is bad practice to use the Natural Keys from the source systems as the Primary key that relates to the Fact Table?

    I generally prefer the term "business key" instead of natural key but it means the same thing.

    If the dimension table has to preserve history then the business key of that table would have to include a date (or perhaps a version number). The dates in the fact table won't necessarily match the dates in the key of all the tables it references. So you'd have to add additional dates or version numbers for every table being referenced and use a compound key for all those references. Or you could do without the foreign keys at all and make every join based on a date range query rather than just straight equality. Both options are possible but do add a certain amount of complexity in most cases.

    Having a surrogate key can also help you track changes to business key values or changes to the keys themselves.

    Surrogate keys aren't necessarily essential in a data warehouse but they typically do help simplify joins in marts for presentation/reporting purposes.