Primary Keys are necessary for good database design

  • I have come accross an interesting database design where some of its tables (not central) have not primary keys at all. I am talking about MS SQL 2008. Nearly all of them have ID column, where most of them have identity (isIndentity) and Indexable set to Yes.

    Some of them do not have even an identity column. For example one has not PK at all. It has ID column, which is not an identity coulmn at all, but it is Foreign Key. Something like:

    CONSTRAINT fk_MyConstrain FOREIGN KEY (ID)

    REFERENCES OtherTable(ID).

    Another example is a table with ID as identity coulmn (Identity Specification, identity increment and Indexable set to yes), but not PK. It works fine comercially for years. So why PK are needed at all. ID could be genereted as uniques column autogenerated and on top of that we could apply indexes on to get better speed performance.

  • pereubu (12/19/2011)


    I have come accross an interesting database design where some of its tables (not central) have not primary keys at all. I am talking about MS SQL 2008. Nearly all of them have ID column, where most of them have identity (isIndentity) and Indexable set to Yes.

    Some of them do not have even an identity column. For example one has not PK at all. It has ID column, which is not an identity coulmn at all, but it is Foreign Key. Something like:

    CONSTRAINT fk_MyConstrain FOREIGN KEY (ID)

    REFERENCES OtherTable(ID).

    Another example is a table with ID as identity coulmn (Identity Specification, identity increment and Indexable set to yes), but not PK. It works fine comercially for years. So why PK are needed at all. ID could be genereted as uniques column autogenerated and on top of that we could apply indexes on to get better speed performance.

    Just a couple of notes...

    -- You set a PK if you want to enfonce uniqueness on the PK colum/columns - you can also do the same with a unique index.

    -- You are not required by law to include an identity column in your tables.

    -- Some tables like error logs, etc usually do not enforce uniqueness so, the need to enforce uniqueness depends on the target table.

    -- Having said that, it is usual to see either a PK or an Unique Index on a table that it is either part of a 3NF or Dimensional Model design.

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
  • PaulB-TheOneAndOnly (12/19/2011)


    pereubu (12/19/2011)


    I have come accross an interesting database design where some of its tables (not central) have not primary keys at all. I am talking about MS SQL 2008. Nearly all of them have ID column, where most of them have identity (isIndentity) and Indexable set to Yes.

    Some of them do not have even an identity column. For example one has not PK at all. It has ID column, which is not an identity coulmn at all, but it is Foreign Key. Something like:

    CONSTRAINT fk_MyConstrain FOREIGN KEY (ID)

    REFERENCES OtherTable(ID).

    Another example is a table with ID as identity coulmn (Identity Specification, identity increment and Indexable set to yes), but not PK. It works fine comercially for years. So why PK are needed at all. ID could be genereted as uniques column autogenerated and on top of that we could apply indexes on to get better speed performance.

    Just a couple of notes...

    -- You set a PK if you want to enfonce uniqueness on the PK colum/columns - you can also do the same with a unique index.

    -- You are not required by law to include an identity column in your tables.

    -- Some tables like error logs, etc usually do not enforce uniqueness so, the need to enforce uniqueness depends on the target table.

    -- Having said that, it is usual to see either a PK or an Unique Index on a table that it is either part of a 3NF or Dimensional Model design.

    I agree to Paul and add few more points here.

    •Having Primary Key is not a MUST requirement. It’s a SHOULD requirement.

    •Same is with Identity (Paul already explain that)

    •Having Identity in a table doesn’t mean Primary Key.

    •A Primary Key not need to be an Identity. Any unique column qualifies to be a primary key.

    •Having PKs on all tables doesn’t guarantee a Good Design & vice versa.

    Now you understand that there are various factors that we MUST consider before deciding on PK. However missing PKs on most of the tables is most probably a bad design & needs a review.

  • And to add a few other points:

    * All or some of the columns in the primary key may also be a foreign key.

    * A foreign key does not have to reference the primary key, any unique index can do.



    Ole Kristian Velstadbråten Bangås - Virinco - Facebook - Twitter

    Concatenating Row Values in Transact-SQL[/url]

  • Also, if it's a commercial application (that's been running for years) then the developers of the app are probably enforcing the uniqueness and search/retrieval commands in code. I've seen apps like that which offered different "database" options (sometimes even VSAM-based!) and dealt with the problems with writing to a specific DB interface by avoiding it altogether.

  • In theory there is not difference between theory and practice, in practice there is. No offence intended, but Celco presents nice and valid theory, in practice it's not always the best thing though. Yes Celco, I know I can use a 20 byte variable length serial number as primary key in a production table. However, this would often include having a clustered index on this column, which would cause this 20 byte variable length column to be present in all my nonclustered indexes. This would cause every non clustered index to grow quite a lot, and thus cause a lot of extra IO.

    I know very well of natural keys, yet I DO use identity, not because it's more easy, but because it's often the most efficient for SQL Server. Personally, I'm more interrested in the performance of my systems, than theoretical database designs.



    Ole Kristian Velstadbråten Bangås - Virinco - Facebook - Twitter

    Concatenating Row Values in Transact-SQL[/url]

  • okbangas (12/24/2011)


    In theory there is not difference between theory and practice, in practice there is. No offence intended, but Celco presents nice and valid theory, in practice it's not always the best thing though. Yes Celco, I know I can use a 20 byte variable length serial number as primary key in a production table. However, this would often include having a clustered index on this column, which would cause this 20 byte variable length column to be present in all my nonclustered indexes. This would cause every non clustered index to grow quite a lot, and thus cause a lot of extra IO.

    I know very well of natural keys, yet I DO use identity, not because it's more easy, but because it's often the most efficient for SQL Server. Personally, I'm more interrested in the performance of my systems, than theoretical database designs.

    +1

    When you look for a key, you start with the standards of your industry. I would estimate that this covers ~80% of the properly defined systems -- VIN, ISBN, UPC, EAN, etc. which can be verified and validated via a trusted external source. In the old days, this took time; you can Google it today.

    After that, there is a natural key in the data, such as (longitude, latitude), store, cash register, ticket_nbr, etc which can be verified and validated in the reality of the data. This is ~18% of the cases. if you have no natural key, then your schema is probably wrong.

    It’s absolutely irrelevant here.

  • Sigh, why did I get involved in this case. Now I'm writing about it ... Well, at least it is a good recap of knowledge for myself, and hopefully someone else can make use of it later.



    Ole Kristian Velstadbråten Bangås - Virinco - Facebook - Twitter

    Concatenating Row Values in Transact-SQL[/url]

  • Clecko,

    I read your post and it makes me want to go out and buy all of your books.

    What a great way to start the New Year.:hehe:

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • IMHO, Kimberly Tripp gives the most compelling reason to use an Integer-based PK and why it should usually be the the Clustered Index... performance. Please see the "movie" at the following URL. It's worth every minute.

    http://technet.microsoft.com/en-us/sqlserver/Video/gg508879

    And, yeah... there's nothing wrong with having a surrogate key as the PK and a natural key as an AK. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (12/30/2011)

    ...

    And, yeah... there's nothing wrong with having a surrogate key as the PK and a natural key as an AK. 😉

    Especially when the value of the natural key changes, and you only have to update one row in one table instead of updating 20,000,000 rows in 30 referencing tables.

  • Welsh Corgi (12/30/2011)


    Clecko,

    I read your post and it makes me want to go out and buy all of your books.

    What a great way to start the New Year.:hehe:

    ROFL :hehe:

  • Michael Valentine Jones (12/30/2011)


    Jeff Moden (12/30/2011)

    ...

    And, yeah... there's nothing wrong with having a surrogate key as the PK and a natural key as an AK. 😉

    Especially when the value of the natural key changes, and you only have to update one row in one table instead of updating 20,000,000 rows in 30 referencing tables.

    Don'cha just love it when that happens? 😀

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • It is much better to update the 20M records, since it is a theoretically correct design 😛



    Ole Kristian Velstadbråten Bangås - Virinco - Facebook - Twitter

    Concatenating Row Values in Transact-SQL[/url]

  • CELKO (12/24/2011)


    Please read a book on the Relational model. No table should have an IDENTITY column; it is a locator and not an attribute of the entities in the data model.

    If no table should have an Identity Column then Sequence Numbers should not be used in Oracle or future releases of SQL Server.:laugh:

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Viewing 15 posts - 1 through 15 (of 22 total)

You must be logged in to reply to this topic. Login to reply