PK v Unique Index

  • I user request to get an article, which would explain in details the differences between primary key and unique index, especially from disk i/o point of view, I think it will be of a massive help to many SQL Server Central readers.

  • Steve Jones - SSC Editor (2/20/2015)


    I user request to get an article, which would explain in details the differences between primary key and unique index, especially from disk i/o point of view, I think it will be of a massive help to many SQL Server Central readers.

    Are you sure you're not getting PKs and Unique indexes mixed up with Clustered and Non-Clustered Indexes here?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • That's what a user asked for.

    I think that there shouldn't be any difference between a PK and unique index, whether CI or NCI, but perhaps there's some misunderstanding out there.

    Be good to just have someone look at this and document it.

    Not sure about the CI v NCI part. That's covered in places, but I'll look over some of our indexing pieces and see what I think.

  • Steve Jones - SSC Editor (2/20/2015)


    I think that there shouldn't be any difference between a PK and unique index, whether CI or NCI, but perhaps there's some misunderstanding out there.

    Actually there's a difference in that columns in a unique index can contain nulls. In the case of a single column index this is no big deal, because it can have at most one null. But with a multi-column index the number of nulls isn't really limited, the only limit is that two nulls are considered equal when determining uniqueness. I'm fairly sure that this is actually of no practical interest as far as disk io is concerned, though.

    Tom

  • TomThomson (2/20/2015)


    Steve Jones - SSC Editor (2/20/2015)


    I think that there shouldn't be any difference between a PK and unique index, whether CI or NCI, but perhaps there's some misunderstanding out there.

    Actually there's a difference in that columns in a unique index can contain nulls. In the case of a single column index this is no big deal, because it can have at most one null. But with a multi-column index the number of nulls isn't really limited, the only limit is that two nulls are considered equal when determining uniqueness. I'm fairly sure that this is actually of no practical interest as far as disk io is concerned, though.

    Wording it slightly differently, primary key column cannot be nullable, unique index column can. As for the IO goes, there is absolutely no difference between a non-clustered primary key and a unique index.

    😎

  • I guess my point is that while there is the difference in Nullability, an NC-PK isn't really going to have any I/O advantage over a Unique NCI as the user thinks. The confusion between "PK" and "Clustered" may be exactly what is article-worthy here because a lot of folks think that a PK has to be the Clustered Index.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (2/21/2015)


    I guess my point is that while there is the difference in Nullability, an NC-PK isn't really going to have any I/O advantage over a Unique NCI as the user thinks. The confusion between "PK" and "Clustered" may be exactly what is article-worthy here because a lot of folks think that a PK has to be the Clustered Index.

    Good point! Would it be worth while touching on the storage difference between CL-PK vs. NC-PK, Unique Index and Unique Constraint or rather that the latter three are all implemented as NC indices? Somehow have the recollection that this has already been covered.

    😎

  • Eirikur Eiriksson (2/21/2015)


    Jeff Moden (2/21/2015)


    I guess my point is that while there is the difference in Nullability, an NC-PK isn't really going to have any I/O advantage over a Unique NCI as the user thinks. The confusion between "PK" and "Clustered" may be exactly what is article-worthy here because a lot of folks think that a PK has to be the Clustered Index.

    Good point! Would it be worth while touching on the storage difference between CL-PK vs. NC-PK, Unique Index and Unique Constraint or rather that the latter three are all implemented as NC indices? Somehow have the recollection that this has already been covered.

    😎

    Heh... just about everything has already been covered on this site in the form of an article. I sometimes have to remind myself that there's nothing wrong with a fresh perspective. For example, I was really nervous about getting "bad marks" from heavy hitters that already knew what a Tally Table was when I wrote the article on the subject but I also knew that a lot of folks still didn't know what it was. Of those that did, a lot of people didn't understand how it worked nor did they have the understanding that a SELECT is actually like a loop.

    The other thing is that a lot of folks don't say "Today, I'm going to sit down and learn about the differences in the types of indexes" but they will read an article that pops up in the SSC headlines. That's why Steve republishes certain articles on Fridays.

    To answer your questions, yes. I believe that such an article as you've described, especially if it were geared to freshman understanding, is exactly the kind of article I thing Steve is looking for based on the question that he said he received and some of the posts that we continue to see. Of course, it should focus on the question that Steve asked... "PK vs Unique Index".

    I'm actually out of line here, though, because I've taken to speaking for Steve without asking. With that point in mind...

    Steve, is THAT what you're thinking about for an article here?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I think that there are only three points that need to be got over to people on this topic:

    (a) a nonclustered primary key and a non-clustered unique index where the columns are constrained NOT NULL are functionally and performance-wise identical.

    (b) a clustered primary key and a clustered unique index where the colums are constraint NOT NULL are functionally and performance-wise identical.

    (c) when columns in a unique index are not constrained NOT NULL there is a trivial functional difference from the case where they are constrained NOT NULL, but there is still no performance difference from the correstponsing primary key. The functional difference is sometimes useful - sometimes it is useful to allow nulls, sometimes it isn't, somethimes it is usefull to forbid nulls, sometimes it isnt, it depends on whether it is useful to be able to insert things before one knows enough to say exactly what they are or more useful to insist that we know what something is before we insert it; But this trivial difference has no impact whatever on IO performance, or (in fact) on any other aspect of performance.

    I think there are some articles already covering this (I've read some good articles about indexes here; but maybe I wouldn't have noticed if they skipped this extrremely elementary level, so I'm not really certain it's covered). I also think that it's such a basic thing that every DBA or Database developer should be 100% aware of it before they are allowed to even think about schema desig, never mind actually do any. The thing is that I can't envisage an article aimed only at comparing the unique index with a primary key on the same columns being any longer than half a page (and I would probably have trouble making it even that long, despite my regrettable inclination towards pedantic verbosity).

    Tom

  • Tom has a good summary here. If someone would like to tackle a short, simple, intro piece, we'd appreciate it.

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply