The First Rule

  • Steve Jones - SSC Editor

    SSC Guru

    Points: 715889

    Comments posted to this topic are about the item The First Rule

  • erwin oosterhoorn

    SSCrazy

    Points: 2009

    Agree, but with the provision that you can view the changes or edits. In our system, you are able to mark data as incorrect but others can still view that data, in case a decision was made related to the data. In those situations, they have to be able to see the previous entry to make amendments based on the change.

    I struggle sometimes to make a change to data input on other systems where it is impossible to get information on how to remove or change it. The worst cases are when you have an old email on a system and you can not make a change as they send the information to said email!

  • steveo250k

    SSC Veteran

    Points: 260

    Are you implying we need to be able to change the Primary Key?!?  Never.  You even mention a dislike for natural attributes as primary keys.  Of course.  Because natural attributes can change and primary keys should not.  Changing a primary key breaks the relational aspect of a RDMS.

    With the primary key fixed, any other value (attribute) of the set (a.k.a., record) can, and should be allowed to change.  Every table in every database I create has a minimum of 3 columns: Primary Key, Create Date, Modified Date.  I expect the data to change.  But, the Primary Key will never change.

    It seems to me the problem is not the database, it's the application that uses the database.  It's not a DBA's problem.  It's an Application Developer's problem.  They failed to properly gather requirements.  They, apparently, only coded the happy path.

  • Jeff Moden

    SSC Guru

    Points: 994680

    Knuth once wrote the parable of "Premature optimization is the root of all evil".  That has been interpreted (actually, improperly warped) by many as leaving in place NUMERIC(18,0) for integers of all flavors and NVARCHAR(256) even for a "Y/N" column and arguing that it's OK to do based on a really silly interpretation of the parable.  It's as bad as the people that think it's ok to make every VARCHAR() a VARCHAR(MAX).  Those people don't understand that the parable isn't saying that it's OK to violate good design rules.

    I can see people misinterpreting this "First Rule" notion (and I agree with this "First Rule") as it being OK to be further lackadaisical in designing something that works well from the git.

    Of course, when the proverbial poo hits the fan and performance tanks on the database, they'll be quick to say something equally stupid like "See? You wouldn't have this problem if we used NoSQL (or some other flavor-of-the-day method)".

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
    "If you think its expensive to hire a professional to do the job, wait until you hire an amateur."--Red Adair
    "Change is inevitable... change for the better is not."
    When you put the right degree of spin on it, the number 3|8 is also a glyph that describes the nature of a DBAs job. 😉

    Helpful Links:
    How to post code problems
    Create a Tally Function (fnTally)

  • RonKyle

    SSC-Dedicated

    Points: 31457

    This is one reason I dislike many natural keys as PKs.

    I don't understand how this follows from the discussion..  First, there's not a whole lot of cases where there is a real natural key.  But as I said recently in another post, OH for Ohio is a perfectly good natural key, and it is not going to change.  SKU codes may be one example, but most businesses would add new ones rather than changed existing ones.  Many other natural keys are business keys that function as surrogates, such as an order number.  Simply forgoing these brings in the chance of different errors, such as duplicates that aren't caught.  I've seen these and had to de dup many times.  So please stop with the "no natural keys" to avoid issues because you are really trading one set of possible issues for a different set.  And in my view updating a natural key is a rare but doable item.  De duping can be very difficult.

  • mark.magagna

    Valued Member

    Points: 74

    "OH for Ohio is a perfectly good natural key, and it is not going to change." Until it does - states have split quite a few times, e.g. Virginia and West Virginia, North and South Carolina, etc.

    I think "natural" keys are misleading - they are all subject to change. Instead of trying to find a primary key in the attributes, just assign an arbitrary number and be done with it - that way the only changes will be those you control.

  • Jeff Moden

    SSC Guru

    Points: 994680

    RonKyle wrote:

    This is one reason I dislike many natural keys as PKs.

    I don't understand how this follows from the discussion..  First, there's not a whole lot of cases where there is a real natural key.  But as I said recently in another post, OH for Ohio is a perfectly good natural key, and it is not going to change.  SKU codes may be one example, but most businesses would add new ones rather than changed existing ones.  Many other natural keys are business keys that function as surrogates, such as an order number.  Simply forgoing these brings in the chance of different errors, such as duplicates that aren't caught.  I've seen these and had to de dup many times.  So please stop with the "no natural keys" to avoid issues because you are really trading one set of possible issues for a different set.  And in my view updating a natural key is a rare but doable item.  De duping can be very difficult.

    I totally agree with everything except the bold thing above... Yes, I agree it's "doable", but it's not as easy as many would have you believe on a mature system if it has been used as a "key" in a shedload of other tables especially if people don't have the "religion" about properly naming columns.

    As for it's rarity... perhaps on some people's systems but we've have to go through it about 8 or so times now (I've lost count) in the last 8 years and it's always a weeks-long task because people also keep adding and changing tables.  Changing monster audit tables is a huge task, as well (some are 1/2TB in size)

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
    "If you think its expensive to hire a professional to do the job, wait until you hire an amateur."--Red Adair
    "Change is inevitable... change for the better is not."
    When you put the right degree of spin on it, the number 3|8 is also a glyph that describes the nature of a DBAs job. 😉

    Helpful Links:
    How to post code problems
    Create a Tally Function (fnTally)

  • RonKyle

    SSC-Dedicated

    Points: 31457

    You go ahead and do your de duplication that will inevitably occur with some surrogate only keys.  I've seen these results many times.  I have yet to see a state split.  You throw out the possibility of an event that has not occurred in our lifetime and completely ignore things that have happened in our databases.  Sorry, but this seems ridiculous to me.

  • dbishop

    Mr or Mrs. 500

    Points: 565

    "OH for Ohio is a perfectly good natural key, and it is not going to change." Until it does - states have split quite a few times, e.g. Virginia and West Virginia, North and South Carolina, etc.

    I'm pretty sure that they likely didn't use abbreviations for states back in 1712 when North and South Carolina split. They certainly didn't have databases with primary keys back them. And even in 1861, when Virginia split, if they did use abbreviations, they just added one (WV) and likely didn't change Virginia. Regardless, considering that considering that state abbreviations are both ANSI and ISO standards, they are not likely to change. I'd much rather see San Diego - CA - 92127 in an address table than San Diego - 5 - 92127. Just one more JOIN that needs to be done to get the information.

    As I see it, there is nothing wrong with "immutable" natural keys. LastName + FirstName + MiddleName as a natural key? NEVER!!!!!

  • mark.magagna

    Valued Member

    Points: 74

    So you don't see California splitting any time soon?

    This is a side point - a state is not going to be a key for anything other than itself. To track e.g. a facility, you generally need a state (which I will grant you will not change often) plus other attributes - and picking other attributes that will change as often as state will be pretty difficult.

  • Rod at work

    SSC-Dedicated

    Points: 33140

    I agree with you, Steve. Unfortunately, I also see some systems which make changing data extremely hard. I've spoken about it here before. I'm pleased to say that it has improved. It took years to improve it, but it has improved.

    Kindest Regards, Rod Connect with me on LinkedIn.

  • dbishop

    Mr or Mrs. 500

    Points: 565

    So you don't see California splitting any time soon?

    Maybe seceding. Then we can just throw CA away. Or perhaps sliding into the ocean, then we can just throw CA away 🙂

  • Steve Jones - SSC Editor

    SSC Guru

    Points: 715889

    RonKyle wrote:

    This is one reason I dislike many natural keys as PKs.

    I don't understand how this follows from the discussion..  First, there's not a whole lot of cases where there is a real natural key.  But as I said recently in another post, OH for Ohio is a perfectly good natural key, and it is not going to change.  SKU codes may be one example, but ...

    I dislike them as PKs and I believe using a surrogate is a better choice. Why? If I do change something, I'm not changing it in many places, which is especially problematic as more companies try to ensure safe data in dev/test or compliance with regulations.

    The world has changed. Surrogate keys provide better links and are easy to maintain even when the data changes. The problems you mention of duplicates or other issues can still be handled with constraints or other checks.

  • Steve Jones - SSC Editor

    SSC Guru

    Points: 715889

    dbishop wrote:

    As I see it, there is nothing wrong with "immutable" natural keys. LastName + FirstName + MiddleName as a natural key? NEVER!!!!!

    The one thing I'll say in defense of surrogate keys is that very few of us really understand what immutable means in the real world. The things we model are very, very messy, and often little thought is put into requirements or even the way we set up the real world. I've heard too many times that x will never change. And then it does.

     

  • andrew gothard

    SSChampion

    Points: 12295

    As far as natural / surrogate keys go, natural keys *may* be inappropriate as a PK.  However, whether it's the PK or not, if you do not assign a unique index to a natural key - your design is simply wrong.  Not only from the generation of duplicates perspective, but also from a performance perspective, because if you're not going to be searching or filtering on this frequently, I really have no idea how it could actually be a natural key.

    I'm a DBA.
    I'm not paid to solve problems. I'm paid to prevent them.

Viewing 15 posts - 1 through 15 (of 25 total)

You must be logged in to reply to this topic. Login to reply