• michael vessey (10/25/2010)


    An interesting article, but there seems to be one aspect that has not been covered.

    The article suggests that there is a slight overhead using a surrogate key (as an extra column is created ), however actually this can lead to a massive saving where foreign keys are concerned.

    Precisely. Until you actually use the PK as an FK 4 levels down you cannot begin to appreciate how much crap is carved out of the great grandchild table by the use of a surrogate key. All of that crap has to be used in joins if you use a natural key. So with a natural key there is a storage overhead as well as performance overhead. And the overhead of the indexes (storage and maintenance) required on all that crap in each succeeding generation.

    The PK is (or is intended to be) nothing more or less than a pointer from parent to child. It is not what defines uniqueness. The column or columns which define uniqueness can change on a moment's notice. The PK/FK should never change. With a surrogate that is in fact the case, it will never change. With a natural key Murphy says it absolutely will change, often and at the worst possible time.;-)

    Surrogate keys were introduced long ago for a very good reason. They allowed us as designers to have a tool which the DB engine uses to maintain referential integrity in the most efficient manner possible, while simultaneously removing the herculean (and impossible) task of selecting a candidate key that would never change. We get to let the db engine do its job while we do ours, defining correctly normalized entities.

    There is nothing about using a surrogate key that removes the responsibility of proper normalization, or defining candidate keys, or of creating the constraints used to enforce uniqueness. That is absolutely part of our job, and if we don't do all of that stuff every single time we are not doing our job. Telling the db engine how to do its job is not my job however.

    Surrogate PK / FK. Set it and forget it.

    jwcolby54