The First Rule

  • This feels silly.  The fact that during design you make the best choice against imperfect data as to what will or will not change.  That said |I don't understand the hubris of putting your head in the sand and stating the data doesn't  and cannot be changes simply because I said it to be/chose it to be.  Primary keys are no different than anything else: your best shot at picking something that shouldn't change: still we all eventually have to settle for something where the PK doesn't change on any regular basis.  Demanding that everything be perfect though - not realistic.

    By all means put as much due diligence into picking something stable, but by god - if you find that your data STILL got around your best laid plans and two of the unique rows REALLY represent the same actual thing.... FIX IT!   Shtuff happens, so deal with it in something other than a petulant child type of approach.  Surrogate keys might be a good strategy to mitigate how many times it might happen - but still - even that will get challenged eventually.

    Having data that is so pure it no longer represents what is actually going on.... isn't data any more, it's fiction.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • I dislike them as PKs and I believe using a surrogate is a better choice. Why? If I do change something, I'm not changing it in many places, which is especially problematic as more companies try to ensure safe data in dev/test or compliance with regulations.

    The world has changed. Surrogate keys provide better links and are easy to maintain even when the data changes. The problems you mention of duplicates or other issues can still be handled with constraints or other checks.

    I actually don't want to belabor this point as there is a recent thread on just this topic.  But the main point of the article was to design databases so that they have the least chance of going wrong and if they do to provide the best way to do a proper fix quickly.  The advantages are plain to see.  But so are the disadvantages.  I've seen and lived those disadvantages, and fixing them was not easy.   A lot of entities require a surrogate key.  A table of names is a clear case in point.  But a lot don't, and adding them handicaps the design.  You can overcome some of the disadvantages by unique indexing the column(s).  But you are now creating an additional object to the primary key when a primary key on the column(s) would have been sufficient.  As has been pointed out, you may have to make a join that would have otherwise NOT been necessary.  I would not have a surrogate key on a zip code table.  Zip codes don't really change, although a location might have to change it's zip code.  But the presence of a surrogate key complicates that update.

    I have nothing against surrogate keys in principle.  Per Kimball, I use them exclusively in my OLAP designs.  But in my OLTP designs, if there's a good natural/business key, I will use that.  Not doing so, in my view, makes the database that much more complicated.

     

    • This reply was modified 4 years, 11 months ago by  RonKyle. Reason: Added quotes
  • The main point of the article is that we should expect data to change and allow for this. Not just with database design, but with the ETL, NULL/defaults, and more. It's not about keys, but it's about ensuring that we allow clients to get work done.

    I've seen too many tables requiring all fields, without defaults, or having cumbersome constraints that impact the ability of an application, or a user, to get work done.

  • RonKyle wrote:

    You go ahead and do your de duplication that will inevitably occur with some surrogate only keys.  I've seen these results many times.  I have yet to see a state split.  You throw out the possibility of an event that has not occurred in our lifetime and completely ignore things that have happened in our databases.  Sorry, but this seems ridiculous to me.

    Sorry.  I didn't mean to imply that I rely only on the surrogate keys as a PK.  I also have AK's, which are also unique and are what you call "natural keys".  For me, they exist mostly to enforce the very uniqueness you speak of.  However, I will never use AK's as a reference key because they are not necessarily immutable and they are usually much wider than a surrogate.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden wrote:

    I also have AK's, which are also unique and are what you call "natural keys".  For me, they exist mostly to enforce the very uniqueness you speak of.  However, I will never use AK's as a reference key because they are not necessarily immutable and they are usually much wider than a surrogate.

    +1 🙂

    Far away is close at hand in the images of elsewhere.
    Anon.

  • Coincidentally, I started reading a fiction book last night where an editor was musing on how some computer programs are not forgiving.

    "The old certainties of typescript, of pen and ink, of queries pencilled in along the margins, had been replaced by computer files that usually worked, but could be subverted by a user pressing the wrong key late at night.  Hours of effort might be hidden or even deleted by such a mistake, and not every program had the forgiveness necessary to allow one to undo what had been done." -- The Quiet Side of Passion by Alexander McCall Smith

  • That's a great quote

  • Steve --  Sometimes you can get rid of bad auto-fill data by scrolling down to the bad one and hitting the delete key.

  • That's good to know. I'll try it next time

  • And then there's the change in security or the deletion of a full site with no record but a trail of now broken links...

    Secure Connection Failed

    An error occurred during a connection to seths.blog. PR_CONNECT_RESET_ERROR

    Error code: PR_CONNECT_RESET_ERROR

    The page you are trying to view cannot be shown because the authenticity of the received data could not be verified.

    Please contact the website owners to inform them of this problem.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • That might be your network or connection. The links  work for me.

    https://seths.blog/2019/04/the-first-database-rule/

  • Steve Jones - SSC Editor wrote:

    That might be your network or connection. The links  work for me.

    https://seths.blog/2019/04/the-first-database-rule/

    Might be me, indeed.  I'm running some pretty secure connections.  If the connection doesn't prove themselves to be who they say they are, it won't let me connect.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 13 posts - 16 through 27 (of 27 total)

You must be logged in to reply to this topic. Login to reply