Surrogate Key vs Natural Key Dimension Tables Dimensional Model

  • Could someone please help me explain to a nontechnical person why it is bad practice to use the Natural Keys from the source systems as the Primary key that relates to the Fact Table?

    I appreciate any input on this topic.

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Welsh Corgi (3/24/2012)


    Could someone please help me explain to a nontechnical person why it is bad practice to use the Natural Keys from the source systems as the Primary key that relates to the Fact Table?

    I generally prefer the term "business key" instead of natural key but it means the same thing.

    If the dimension table has to preserve history then the business key of that table would have to include a date (or perhaps a version number). The dates in the fact table won't necessarily match the dates in the key of all the tables it references. So you'd have to add additional dates or version numbers for every table being referenced and use a compound key for all those references. Or you could do without the foreign keys at all and make every join based on a date range query rather than just straight equality. Both options are possible but do add a certain amount of complexity in most cases.
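    The trade-off above can be sketched with a small, self-contained example. This is only an illustration under assumed names: the tables (dim_customer, fact_sales_sk, fact_sales_bk), the columns, and the sample rows are all hypothetical, and SQLite (via Python's sqlite3 module) stands in for whatever database the warehouse actually runs on. It shows one customer with two historical versions, then compares a plain equality join on a surrogate key with a date-range join on the business key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical history-preserving dimension: one row per version of a customer.
CREATE TABLE dim_customer (
    customer_sk   INTEGER PRIMARY KEY,  -- surrogate key, unique per version
    customer_code TEXT NOT NULL,        -- business key from the source system
    city          TEXT NOT NULL,
    valid_from    TEXT NOT NULL,        -- inclusive start date
    valid_to      TEXT NOT NULL         -- exclusive end date
);
INSERT INTO dim_customer VALUES
    (1, 'C100', 'Leeds',  '2011-01-01', '2012-01-01'),
    (2, 'C100', 'London', '2012-01-01', '9999-12-31');

-- Fact table that references the dimension by surrogate key.
CREATE TABLE fact_sales_sk (
    sale_date   TEXT,
    customer_sk INTEGER REFERENCES dim_customer (customer_sk),
    amount      REAL
);
INSERT INTO fact_sales_sk VALUES ('2011-06-15', 1, 50.0), ('2012-03-24', 2, 70.0);

-- Fact table that carries only the business key.
CREATE TABLE fact_sales_bk (sale_date TEXT, customer_code TEXT, amount REAL);
INSERT INTO fact_sales_bk VALUES ('2011-06-15', 'C100', 50.0), ('2012-03-24', 'C100', 70.0);
""")

# Surrogate key: a straight equality join picks the right version.
by_surrogate = conn.execute("""
    SELECT d.city, SUM(f.amount)
    FROM fact_sales_sk AS f
    JOIN dim_customer AS d ON d.customer_sk = f.customer_sk
    GROUP BY d.city ORDER BY d.city
""").fetchall()

# Business key: every join needs a date-range predicate to pick the version.
by_date_range = conn.execute("""
    SELECT d.city, SUM(f.amount)
    FROM fact_sales_bk AS f
    JOIN dim_customer AS d
      ON d.customer_code = f.customer_code
     AND f.sale_date >= d.valid_from
     AND f.sale_date <  d.valid_to
    GROUP BY d.city ORDER BY d.city
""").fetchall()

# Business key without the date range: each sale matches both versions.
naive = conn.execute("""
    SELECT d.city, SUM(f.amount)
    FROM fact_sales_bk AS f
    JOIN dim_customer AS d ON d.customer_code = f.customer_code
    GROUP BY d.city ORDER BY d.city
""").fetchall()

print(by_surrogate)
print(by_date_range)
print(naive)
```

    In this sketch the first two queries agree, while the third double counts because a join on the business key alone matches every version of the customer. The date-range version works, but the extra predicate has to be repeated, correctly, in every query and for every dimension.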

    Having a surrogate key can also help you track changes to the values stored against a business key, or changes to the business keys themselves (for example, when a source system renumbers its codes).
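    As a rough illustration of a business key changing, here is a minimal sketch. The table names (dim_product, fact_sales), the product codes, and the renumbering scenario are hypothetical, and SQLite via Python's sqlite3 module is used only so the example runs anywhere. The point it tries to show is that when the source system renumbers a product, only the dimension row changes; the fact rows keep pointing at the same surrogate key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (
    product_sk   INTEGER PRIMARY KEY,   -- surrogate key owned by the warehouse
    product_code TEXT NOT NULL UNIQUE,  -- business key owned by the source system
    name         TEXT NOT NULL
);
INSERT INTO dim_product VALUES (1, 'P-001', 'Widget');

CREATE TABLE fact_sales (
    product_sk INTEGER NOT NULL REFERENCES dim_product (product_sk),
    amount     REAL NOT NULL
);
INSERT INTO fact_sales VALUES (1, 10.0), (1, 15.0);
""")

# The source system renumbers the product (hypothetical scenario).
# Only the single dimension row is updated; no fact rows are touched.
cur = conn.execute(
    "UPDATE dim_product SET product_code = ? WHERE product_code = ?",
    ("SKU-9001", "P-001"),
)
dim_rows_changed = cur.rowcount

# Existing facts still join to the product through the unchanged surrogate key.
report = conn.execute("""
    SELECT d.product_code, d.name, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS d ON d.product_sk = f.product_sk
    GROUP BY d.product_code, d.name
""").fetchall()

print(dim_rows_changed, report)
```

    Had the fact table stored the product code directly, the same renumbering would have required updating every affected fact row, or keeping a separate mapping from old codes to new ones.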

    Surrogate keys aren't necessarily essential in a data warehouse but they typically do help simplify joins in marts for presentation/reporting purposes.

  • Welsh Corgi (3/24/2012)


    Could someone please help me explain to a nontechnical person why it is bad practice to use the Natural Keys from the source systems as the Primary key that relates to the Fact Table?

    I appreciate any input on this topic.

    This is not really easy to explain to a non-technical person. At the end of the day you will have to say, in a polite way, that there are best practices in the industry and that the company hired you to apply them.

    Having said that, IBM has done a good job of detailing the main reason for using surrogate keys in dimensional modeling. Please check http://publib.boulder.ibm.com/infocenter/rdahelp/v7r5/index.jsp?topic=%2Fcom.ibm.datatools.dimensional.ui.doc%2Ftopics%2Fc_dm_surrogatekeys.html

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge, but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions in a non-production test environment, the advisor should not be held liable or responsible for any actions taken based on the given advice.
  • Thank you.


  • Friend, for details, check this book:

    SQL Server MVP Deep Dives, Volume 2

    Part 1, Architecture, edited by Louis Davidson

    1. Where are my keys? by Ami Levin

    http://www.manning.com/delaney/

    ~ Lokesh Vij


