The Identity Debate

  • I personally think that Primary Keys should be enforced by default on "regular" UNIQUE, non-clustered indexes. If it so happens that the PK and the UCI happen to match - that's great, but I want that to be thought out, not dummied in.

    Matt,

    ABSOLUTELY agreed. And I still believe that wizards are in part at fault for making (allowing?) people to make bad decisions by assuming that the default is right in every case. I realize you can't legislate values, and you can't make people read and understand the real impacts of the choices, but I'm doomed to hope that if the choice is presented without a default and you have to make a choice, you might consider the real impacts of that choice. If you can't discern the impacts or be bothered to look them up before committing yourself, maybe you should consider another line of work.

    Perhaps my attitude could be called old-school. Back in the 4.x days, you did a lot from the command line. Wizards were in D&D, not SQL, and given the difficulty of changing your mind later, you had to have a better clue of what was going on. Wizards may ease some particularly tedious tasks, but is it possible they also dumb-down the interface?

    And what you say basically goes along with what Jeff says in his post right after yours, in a fairly concise description of the conceptual difference. Sometimes logical and physical mesh, but the general case is anything but that simple.

    ------------
    Buy the ticket, take the ride. -- Hunter S. Thompson

  • But there's a default in code as well. If you don't specify it, meaning just create an index, it's non-clustered.

    I agree with you that people need to think, but how do you present that. There's always a default.

  • "It (always) depends", and I've found it's poor performance and complaints from users that tends to focus the DBA's mind.

    After a few episodes, thinking begins.

    P

  • Steve Jones - Editor (2/14/2008)


    But there's a default in code as well. If you don't specify it, meaning just create an index, it's non-clustered.

    I agree with you that people need to think, but how do you present that. There's always a default.

    If I had a choice for the default setting - that would be it (make the PK non-clustered by default).

    I'll go one further (probably incurring untold wrath for this one)- I'd prefer that the PK and the Clustered Key not be allowed to be implemented on the same index even IF it makes sense to put them both on the same values, so that when I need to change one or the other, I don't need to drop both....And I will just say - I know I KNOW this would be bad with our current implementations out there, so I don't do this for real.

    I would just prefer not to open Pandora's box ( by temporarily dropping DRI) just to help improve reporting some amount.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Imho identities are fine keys when order of insert is close to order needed on select.

    (minimum bits to define uniqueness and order)

    If the natural key predicates of the majority selects define an order very much not of chronological entry and select is many times more significant than insert; they are intuitively a poor choice for a primary clustered key. (due to poor clustering for retrieval or the higher likelihood of the row being on a different page)

    Just so happens that the order of many users needs is a small current subset (single CRUD) then aggregate. For Example Analytics, where aggregate is more common than select as inserted, you end up with very different clustering and indexing.

  • Good thread - nice work, everyone.

    I typically use identity, clustered PKs on almost every table in my OLTP DBs to start with - picking the right-sized integer data type for the need. I understand "the replicators'" argument about using GUIDs, though. I balance that technique based on end user needs/desires for veiwing and communicating about the data, but will never argue that identities work well in merge replication or updating subscriber trans repl (or similar) scenarios. Also, I can't recall how the indexing issues related to the SQL 2000 generated uuids have translated into 2005...I vaguely recall some ideas about combining uuids and incremental integer values to overcome...but I digress.

    Reasons for surrogate (identity or uuid) are same as other's have argued - simple joins, natural keys change, etc.

    I think it's key (pun intended) to try to put unique constraints on the candidate/natural keys when you can.

    I'm also not opposed to using character codes (~<4 chars) instead of SIDs for simpler lookup tables, in order to supply some sort of semantics in the tables where the lookups have FKs.

    Finally, and perhaps unrelated to the original intent of this thread, I'd like to see standards/efforts for intergating lookup tables with software enumerated types. If the software build generates enums based on DB lookups, it can avoid an actual lookup during OLTP insert/update ops to supply the integer or character code for an FK value, leaving the join only for reporting. Perhaps cached maps also fit into this concept. Is this idea something similar to spt_values?

    OK, one more...I don't use an identity in a FK table when using a one-to-one relationship, whether in a subtyping scenairo or vertical partitioning. Here, the FK=PK.

    Cheers,

    -Peter

  • peter (2/19/2008)


    Good thread - nice work, everyone.

    Also, I can't recall how the indexing issues related to the SQL 2000 generated uuids have translated into 2005...I vaguely recall some ideas about combining uuids and incremental integer values to overcome...but I digress.

    Some people came up with some rather clever ideas for generating custom "GUIDs" that replace portions of the GUID with sequential numbers, current date/time values, etc., to ensure they are generated in some sort of ascending order. SQL 2005 introduced the NEWSEQUENTIALID function (I think that's the name, might want to double-check) to handle this type of thing without the need for a custom solution.

    AFAIK the only real problem with GUIDs and indexes is when they're used as a clustered index, since the entire table has to be reorganized an awful lot when you insert newly generated GUIDs. That is unless you use NEWSEQUENTIALID or come up with a custom solution to do the same type of thing.

  • Identity is evil.

    Using identity as PK is bad.

    Having a table like this:

    CREATE TABLE Countries

    (CountryID INT IDENTITY PRIMARY KEY,

    CountryName VARCHAR(50) NOT NULL)

    If I were Joe Celko I would ask you how do you

    prevent your table from containing

    CountryID CountryName

    1 Canada

    2 Canada

    3 Sweden

    4 Sweden

    etc

    If you want an incremental counter use:

    SELECT @nextid = coalesce(MAX(id), 0) + FROM tbl (UPDLOCK)

    This must be done in a transaction to prevent two processes getting

    the same ID.

    /m

  • Quite simple really, I'd put an alternate index (unique) or a unique constraint on CountryName.

    😎

  • Let me just say that identity columns is ok but nothing replaces good old natural keys. If your database has been setup right from the start with identity columns then OK but I have seen more than once that you need your trusty primary keys (no identity) on the table to keep your data UNIQUE. But sometimes something happen and you have to make a plan and then in stead of sitting down and properly analyse the problem, people jump in with identity columns. Example: I have a table called Contracts and one called Vehicles which relate to each other by a column called ContractNo. From the start we said that ContractNo must be unique and can never be duplicated. The client came and said that 10 years ago these people use a series of contract numbers and the printers now unknowingly repeated those numbers so instead off having a look at both these tables to see what should be done, the (trusty???) identity column was called in on Contacts table with no relation to the Vehicles table. So, maybe I just want to say that identity columns make people lazy.:P

    :-PManie Verster
    Developer
    Johannesburg
    South Africa

    I can do all things through Christ who strengthens me. - Holy Bible
    I am a man of fixed and unbending principles, the first of which is to be flexible at all times. - Everett Mckinley Dirkson (Well, I am trying. - Manie Verster)

  • And then comes along a merger, and your unique OrderNo/CustomerNo/InvoiceNo whatever is no longer unique as you have to merger both companies data into a single database. It comes down to "It Depends". You have to know when natural keys are acceptable, and when they aren't.

    😎

  • Matt,

    ABSOLUTELY agreed. And I still believe that wizards are in part at fault for making (allowing?) people to make bad decisions by assuming that the default is right in every case...

    Perhaps my attitude could be called old-school... Wizards were in D&D, not SQL...

    Wow, I just realized how much I needed a good laugh (D&D reference)

    This is related to a long conversation I had with a non-Windows friend about how the 'security' pendulum has swung too far in Vista towards giving users options.

    It started with him explaining that his sister had bought their mother a Vista computer - so she could send pictures of the new baby.

    His mother could not understand the security warnings that Vista was giving her, so she kept calling him (a Mac/UNIX guy) to ask which option to choose. Which led him to ask his mother series of questions to get the context of the message so he could make an educated guess - frustrating both of them.

  • I liked the D&D reference too (played it a lot in my younger days!)

    😎

  • Lynn Pettis (2/27/2008)


    I liked the D&D reference too (played it a lot in my younger days!)

    😎

    Me too - like, yesterday:) (All right - RPG, not the dice and board, etc...)

    But I'm older today, so I will put aside my younger ways (until Saturday or so....:w00t:)

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Matt, Online? Which one?

    😎

Viewing 15 posts - 91 through 105 (of 129 total)

You must be logged in to reply to this topic. Login to reply