The First Rule

  • Matt Miller (4)

    SSC Guru

    Points: 124184

    This feels silly.  The fact that during design you make the best choice against imperfect data as to what will or will not change.  That said |I don't understand the hubris of putting your head in the sand and stating the data doesn't  and cannot be changes simply because I said it to be/chose it to be.  Primary keys are no different than anything else: your best shot at picking something that shouldn't change: still we all eventually have to settle for something where the PK doesn't change on any regular basis.  Demanding that everything be perfect though - not realistic.

    By all means put as much due diligence into picking something stable, but by god - if you find that your data STILL got around your best laid plans and two of the unique rows REALLY represent the same actual thing.... FIX IT!   Shtuff happens, so deal with it in something other than a petulant child type of approach.  Surrogate keys might be a good strategy to mitigate how many times it might happen - but still - even that will get challenged eventually.

    Having data that is so pure it no longer represents what is actually going on.... isn't data any more, it's fiction.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • RonKyle

    SSC-Dedicated

    Points: 31457

    I dislike them as PKs and I believe using a surrogate is a better choice. Why? If I do change something, I'm not changing it in many places, which is especially problematic as more companies try to ensure safe data in dev/test or compliance with regulations.

    The world has changed. Surrogate keys provide better links and are easy to maintain even when the data changes. The problems you mention of duplicates or other issues can still be handled with constraints or other checks.

    I actually don't want to belabor this point as there is a recent thread on just this topic.  But the main point of the article was to design databases so that they have the least chance of going wrong and if they do to provide the best way to do a proper fix quickly.  The advantages are plain to see.  But so are the disadvantages.  I've seen and lived those disadvantages, and fixing them was not easy.   A lot of entities require a surrogate key.  A table of names is a clear case in point.  But a lot don't, and adding them handicaps the design.  You can overcome some of the disadvantages by unique indexing the column(s).  But you are now creating an additional object to the primary key when a primary key on the column(s) would have been sufficient.  As has been pointed out, you may have to make a join that would have otherwise NOT been necessary.  I would not have a surrogate key on a zip code table.  Zip codes don't really change, although a location might have to change it's zip code.  But the presence of a surrogate key complicates that update.

    I have nothing against surrogate keys in principle.  Per Kimball, I use them exclusively in my OLAP designs.  But in my OLTP designs, if there's a good natural/business key, I will use that.  Not doing so, in my view, makes the database that much more complicated.

     

    • This reply was modified 1 month, 4 weeks ago by  RonKyle. Reason: Added quotes
  • Steve Jones - SSC Editor

    SSC Guru

    Points: 715889

    The main point of the article is that we should expect data to change and allow for this. Not just with database design, but with the ETL, NULL/defaults, and more. It's not about keys, but it's about ensuring that we allow clients to get work done.

    I've seen too many tables requiring all fields, without defaults, or having cumbersome constraints that impact the ability of an application, or a user, to get work done.

  • Jeff Moden

    SSC Guru

    Points: 994677

    RonKyle wrote:

    You go ahead and do your de duplication that will inevitably occur with some surrogate only keys.  I've seen these results many times.  I have yet to see a state split.  You throw out the possibility of an event that has not occurred in our lifetime and completely ignore things that have happened in our databases.  Sorry, but this seems ridiculous to me.

    Sorry.  I didn't mean to imply that I rely only on the surrogate keys as a PK.  I also have AK's, which are also unique and are what you call "natural keys".  For me, they exist mostly to enforce the very uniqueness you speak of.  However, I will never use AK's as a reference key because they are not necessarily immutable and they are usually much wider than a surrogate.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
    "If you think its expensive to hire a professional to do the job, wait until you hire an amateur."--Red Adair
    "Change is inevitable... change for the better is not."
    When you put the right degree of spin on it, the number 3|8 is also a glyph that describes the nature of a DBAs job. 😉

    Helpful Links:
    How to post code problems
    Create a Tally Function (fnTally)

  • David Burrows

    SSC Guru

    Points: 64521

    Jeff Moden wrote:

    I also have AK's, which are also unique and are what you call "natural keys".  For me, they exist mostly to enforce the very uniqueness you speak of.  However, I will never use AK's as a reference key because they are not necessarily immutable and they are usually much wider than a surrogate.

    +1 🙂

    Far away is close at hand in the images of elsewhere.
    Anon.

  • Marcia J

    SSCertifiable

    Points: 5635

    Coincidentally, I started reading a fiction book last night where an editor was musing on how some computer programs are not forgiving.

    "The old certainties of typescript, of pen and ink, of queries pencilled in along the margins, had been replaced by computer files that usually worked, but could be subverted by a user pressing the wrong key late at night.  Hours of effort might be hidden or even deleted by such a mistake, and not every program had the forgiveness necessary to allow one to undo what had been done." -- The Quiet Side of Passion by Alexander McCall Smith

  • Steve Jones - SSC Editor

    SSC Guru

    Points: 715889

  • Marcia J

    SSCertifiable

    Points: 5635

  • nguntert

    SSCommitted

    Points: 1721

    Steve --  Sometimes you can get rid of bad auto-fill data by scrolling down to the bad one and hitting the delete key.

  • Steve Jones - SSC Editor

    SSC Guru

    Points: 715889

    That's good to know. I'll try it next time

Viewing 10 posts - 16 through 25 (of 25 total)

You must be logged in to reply to this topic. Login to reply