Primary Key vs Clustered Index - Very newbie question

  • First my most humble apologies as I am sure most of you are going to roll your eyes in dispair.

    I am basically an Access person whose databases outgrew the Access size limit a long time ago and we moved our data to Sql Server 2005 but basically left the programming as is. Now we have probably about 35 million records more or less and I was thinking it was about time to learn a little SQL Server.

    So I read the recent articles on indexes and found that bad bad me I have no clustered indexes on the major tables which consist of several million records. Probably a bad thing.

    And I decided to make the primary key clustered and got all set to do it last Sunday and "IT" wouldn't let me.

    Now I think I discovered that if there is a primary "KEY" defined - and there is, then this becomes a clustered index. That's what I just read. True??? I do have a primary key on each table. I didn't create them but I did run some automatic performance tools right after we moved the data and that is probably how they got there.

    If this is so, then fine as that is how I would have defined the clustered index, although I might want to change one slightly.

    Do I even want a primary and then a unique index that is identically defined? Is this some sort of magic that I want or is this a very bad thing?

  • A PK is a logical structure. It means the field(s) that uniquely identify each row in the table.

    This is implemented in SQL Server as a unique index. If you set up a PK, you get a unique index on the table.

    Clustered index (CI) v nonclustered index (NCI) is completely separate. Either can be unique. You get one CI on a table because it is the data. A table without a CI is a heap, and doesn't perform as well in queries. The optimizer in SQL Server is built to work with clustered indexes, so it's recommended that you have a CI on the table.

    The choice of a CI is usually made based on query performance. The PK creation defaults to clustered, but this isn't always the best choice. Often we pick the CI based on range queries. Since it's the data, if you query on some range data, often dates, then a CI on that field can be more efficient.

    For the PK, you can have it as a CI or NCI. If you don't think you have a better choice for a table, then leave the PK as a CI.

  • Carolyn Stern (11/19/2009)


    Now I think I discovered that if there is a primary "KEY" defined - and there is, then this becomes a clustered index. That's what I just read.

    By default, when a primary key is created, if there is no existing clustered index on the table, the primary key is enforced by a clustered index. This is only be default, not a requirement

    If this is so, then fine as that is how I would have defined the clustered index, although I might want to change one slightly.

    If you want to change one, drop the primary key (and I would suggest you learn to do this in script as it gives more control than the GUI designer), then create the clustered index that you want, then go back and recreate the primary key.

    Do I even want a primary and then a unique index that is identically defined?

    Highly unlikely. If one is the clustered and one is not then there might be a very occasional reason for doing that, but very unlikely.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Carolyn Stern (11/19/2009)


    First my most humble apologies as I am sure most of you are going to roll your eyes in dispair.

    Not at all.

    A well asked question often leads to well worded answers which help many of us also.

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply