Database Design Theory - Primary Keys

  • hey all - got a theoretical question.

    Let's say I have a table which has 20 columns. In this table, I have one column which is a primary key - we'll call it ClientID.

    This table has several requests which are launched against it. Most of them use ClientID as the main filter, either through a JOIN or through the WHERE clause. Also, in most cases, there are quite a few fields returned.

    The setup has been working fine, but now there is a need to make it so that the ClientID is not unique anymore. A second column, we'll call it the SetID, has been added, and now the combination of ClientID-SetID is a unique identifier.

    I see three ways of going about solving this problem:

    1. Keep ClientID as the Clustered Index, but remove the Unique clause.

    2. Make a new primary key, with ClientID and SetID as the composite primary key.

    3. Add a new ID field which has an identity specification, make that the primary key.

    Which of these solutions is the best?

  • kramaswamy (5/31/2012)


    hey all - got a theoretical question.

    I see three ways of going about solving this problem:

    1. Keep ClientID as the Clustered Index, but remove the Unique clause.

    This is not a good approach since your primary key is going to change. Presumably you would not need an index on just the ClientID column anymore because you are unlikely to use that to find rows.

    2. Make a new primary key, with ClientID and SetID as the composite primary key.

    This would probably be my choice since this the newly extended unique column set, it would only make sense that it would become the primary key. You can't keep ClientID as the primary key because it will contain duplicate values.

    3. Add a new ID field which has an identity specification, make that the primary key.

    If you use this approach you would still need to add a unique constraint to ClientID, SetID. This means you would have an extra column of data is not providing any real usage. It would eat up disc space and force a second index on the table.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Sean Lange (5/31/2012)


    kramaswamy (5/31/2012)


    hey all - got a theoretical question.

    I see three ways of going about solving this problem:

    1. Keep ClientID as the Clustered Index, but remove the Unique clause.

    This is not a good approach since your primary key is going to change. Presumably you would not need an index on just the ClientID column anymore because you are unlikely to use that to find rows.

    2. Make a new primary key, with ClientID and SetID as the composite primary key.

    This would probably be my choice since this the newly extended unique column set, it would only make sense that it would become the primary key. You can't keep ClientID as the primary key because it will contain duplicate values.

    3. Add a new ID field which has an identity specification, make that the primary key.

    If you use this approach you would still need to add a unique constraint to ClientID, SetID. This means you would have an extra column of data is not providing any real usage. It would eat up disc space and force a second index on the table.

    3. Add a new ID field which has an identity specification, make that the primary key.

    If you use this approach you would still need to add a unique constraint to ClientID, SetID. This means you would have an extra column of data is not providing any real usage. It would eat up disc space and force a second index on the table.

    Actually, some would say this is a valid approach. The Identity column could be used to link ClientID/SetID pairs in other tables (of course this means these values have to be added to those tables as well). There are cons, but one pro is that if down the road another column is added to the ClientID/SetID pair, you don't have to change the primary key, just the unique index defining the user visible unique key (natural key).

  • Yeah I figured the same as well, just wanted to make sure I was correct in that opinion before going ahead with it.

    On a related note, if you have a composite key as the primary key, it does take in to account the order of the columns right? In my example, if I did a query like this:

    SELECT SomeField

    FROM Table

    WHERE ClientID = 123

    It would do a clustered index seek, but if I were to do the following query:

    SELECT SomeField

    FROM Table

    WHERE SetID = 123

    It would do a clustered index scan? And, if I were to create a covering nonclustered index on SetID, would it then do an index seek, or would it determine that a clustered index scan was still the better solution?

  • @Lynn - that was actually something I was thinking about too, for that exact reason. As it stands I'm going to have to modify the other tables one way or another - if I go with the composite key approach, I'll need to add the extra column to the other tables, while if I go with the unique ID approach, I'll need to switch the existing column lookup in the other tables for the new one.

    For all intents and purposes, this aspect will just be a matter of which is more convenient for me, right? Performance-wise, there's no difference if I have a composite key and those same columns as lookups in other tables, vs having a unique ID?

  • Lynn Pettis (5/31/2012)


    Sean Lange (5/31/2012)


    kramaswamy (5/31/2012)


    hey all - got a theoretical question.

    I see three ways of going about solving this problem:

    1. Keep ClientID as the Clustered Index, but remove the Unique clause.

    This is not a good approach since your primary key is going to change. Presumably you would not need an index on just the ClientID column anymore because you are unlikely to use that to find rows.

    2. Make a new primary key, with ClientID and SetID as the composite primary key.

    This would probably be my choice since this the newly extended unique column set, it would only make sense that it would become the primary key. You can't keep ClientID as the primary key because it will contain duplicate values.

    3. Add a new ID field which has an identity specification, make that the primary key.

    If you use this approach you would still need to add a unique constraint to ClientID, SetID. This means you would have an extra column of data is not providing any real usage. It would eat up disc space and force a second index on the table.

    3. Add a new ID field which has an identity specification, make that the primary key.

    If you use this approach you would still need to add a unique constraint to ClientID, SetID. This means you would have an extra column of data is not providing any real usage. It would eat up disc space and force a second index on the table.

    Actually, some would say this is a valid approach. The Identity column could be used to link ClientID/SetID pairs in other tables (of course this means these values have to be added to those tables as well). There are cons, but one pro is that if down the road another column is added to the ClientID/SetID pair, you don't have to change the primary key, just the unique index defining the user visible unique key (natural key).

    I certainly did not say it was not a valid approach. It is indeed a valid approach. My first choice would likely be choice #2 but that may be as much preference as anything. Either #2 or #3 is a valid approach. #1 just simply is not a good path. You certainly bring up a good point about the key changing, yet again, and how #3 would make that easier to deal with.

    I don't think anybody on here can say with certainty which of those two options is best for you scenario. There are so many other things that would affect a decision like this. Kudos to you for thinking it through, coming up with some viable

    options and discussing it.

    Above all keep in mind that any other tables referencing this will need to have either a new column added for SetID (choice #2) or update the values in the existing foreign key column to match your newly created identity values.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • kramaswamy (5/31/2012)


    @Lynn - that was actually something I was thinking about too, for that exact reason. As it stands I'm going to have to modify the other tables one way or another - if I go with the composite key approach, I'll need to add the extra column to the other tables, while if I go with the unique ID approach, I'll need to switch the existing column lookup in the other tables for the new one.

    For all intents and purposes, this aspect will just be a matter of which is more convenient for me, right? Performance-wise, there's no difference if I have a composite key and those same columns as lookups in other tables, vs having a unique ID?

    hehe - you mentioned the new columns and maintaining RI as I was typing. 😛

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Here is my problem with Natural Keys (not saying they are wrong), they tend to not stay unique. Your case in this thread is a good example. I have also worked where this has been a value that could be requested to be changed for security purposes (a student id for example).

  • Lynn Pettis (5/31/2012)


    Here is my problem with Natural Keys (not saying they are wrong), they tend to not stay unique. Your case in this thread is a good example. I have also worked where this has been a value that could be requested to be changed for security purposes (a student id for example).

    Funny you should mention that...I tend to (almost) always use an identity as a PK because it just makes things so easy to work with. You never have to have that moment of pause "What is the key in this table again??".

    That may be the main reason I made the decision I made about how to handle this, because I don't usually have to deal with or even think about it.

    [Now I am hoping CELKO doesn't come along and roast me for this because that is not how "good" sql developers do things or whatever]

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Sean Lange (5/31/2012)


    Lynn Pettis (5/31/2012)


    Here is my problem with Natural Keys (not saying they are wrong), they tend to not stay unique. Your case in this thread is a good example. I have also worked where this has been a value that could be requested to be changed for security purposes (a student id for example).

    Funny you should mention that...I tend to (almost) always use an identity as a PK because it just makes things so easy to work with. You never have to have that moment of pause "What is the key in this table again??".

    That may be the main reason I made the decision I made about how to handle this, because I don't usually have to deal with or even think about it.

    [Now I am hoping CELKO doesn't come along and roast me for this because that is not how "good" sql developers do things or whatever]

    You know, sometimes you actually have to work in the real world and deal with real world scenerios and what is "best" from the Ivory Tower just doesn't work.

    And if he does come, he's all yours. I have instructions to leave enough alone.

  • Lynn Pettis (5/31/2012)


    Sean Lange (5/31/2012)


    Lynn Pettis (5/31/2012)


    Here is my problem with Natural Keys (not saying they are wrong), they tend to not stay unique. Your case in this thread is a good example. I have also worked where this has been a value that could be requested to be changed for security purposes (a student id for example).

    Funny you should mention that...I tend to (almost) always use an identity as a PK because it just makes things so easy to work with. You never have to have that moment of pause "What is the key in this table again??".

    That may be the main reason I made the decision I made about how to handle this, because I don't usually have to deal with or even think about it.

    [Now I am hoping CELKO doesn't come along and roast me for this because that is not how "good" sql developers do things or whatever]

    You know, sometimes you actually have to work in the real world and deal with real world scenerios and what is "best" from the Ivory Tower just doesn't work.

    And if he does come, he's all yours. I have instructions to leave enough alone.

    I have no intention of wrestling with him, I am just hoping he doesn't find this thread because I can already hear his response to my post. 😉

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply