Give Up on Natural Primary Keys

  • hmbacon wrote:

    OH is actually an alfa surrogate for Ohio.

    Not by definition.  Part of the definition of a surrogate key is that you should be able to derive NO information by examining just the key.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden wrote:

    nova wrote:

    Don't get distracted by the arbitrary idea that one key or other has to be primary key. In practical terms it doesn't matter which of the keys in a table is called "primary". What matters is what keys you have and how you use them.

    It makes a huge difference when you trying to get developers and other people to use one or the other and how they use it.

    Why? Developers can't just work with surrogates and if they think that one primary key is all that matters then they ought to be educated otherwise. Applications normally have to be aware of and use the non-surrogate keys in a table because those are the keys used for searches and presented to end users. Users need to be aware of what keys to use in the real world in order to use data effectively. So if, as previously suggested, you do use the surrogate key as primary then you'll certainly want the users and developers to be aware of the non-primary keys. In that sense the primary-ness of one key is potentially a hindrance in my experience. Most developers and users don't need to be aware of the actual primary key. Very likely they get their understanding of the database from documentation which may highlight other keys entirely.

  • I agree that some developers do not understand natural keys.  I have had developers tell me that they can just use the surrogate key to get a specific row.  My questions was, "How do you get the surrogate key?"  I have an idea that for every table created there should be a function to check/select a specific row based on the natural  unique identifier.  Then the developers would not need to know the columns in the natural unique identifier, it would be in the parameter list of the function.  The function would return null if the key did not exist or the surrogate key if it does.  The convention would be not to create another row if an id is returned.   This function could be generated from the unique key constraint defined in the data dictionary.  This will not prevent duplicates in all cases because it is possible to get around the defenses.  How many of us have accounts at web sites that we don't use very often and when we are in a hurry to login we create another account with a number added, a different email address or a different username.  This is a problem with people and our names.

  • I went back and re-read the original article to make sure my comments have not been off topic.  The title is provocative but missleading.  The opinion is really about using natural unique identifiers (PK, UK, key who cares for this discussion) that use PII or GDPR data and are used in foreign key columns.  In that case, the sensitive data gets spread all over the place.  Using a surrogate key is valuable for minimizing this proliferation of sensitive data but is does not eliminate the necessity of a natural unique identifier in ALL tables.  Surrogate keys are optional.

    Marcus

Viewing 4 posts - 106 through 108 (of 108 total)

You must be logged in to reply to this topic. Login to reply