SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


t-sql key on a table


t-sql key on a table

Author
Message
Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)

Group: General Forum Members
Points: 86041 Visits: 41095
Christian Graus (1/6/2014)
Thanks - good to know.


Just so you don't think that I'm making this up, here's the entry from Books Online under the title of "Creating and Modifying PRIMARY KEY Constraints ". I know I hate it when someone tells me something about SQL Server and can't actually back it up.


The Database Engine automatically creates a unique index to enforce the uniqueness requirement of the PRIMARY KEY constraint. If a clustered index does not already exist on the table or a nonclustered index is not explicitly specified, a unique, clustered index is created to enforce the PRIMARY KEY constraint.



--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)

Group: General Forum Members
Points: 86041 Visits: 41095
Ed Wagner (1/3/2014)
For this table, it is fine if you add a column as IDENTITY (1, 1) and define it as a primary key. It'll be a 4-byte key.

It appears that the combination of the two columns is unique, so the other possibility is to define a natural key using the combination of the two columns. This would be called a composite key because it contains more than one column. There is great debate between using artificial keys and natural keys and I'm not trying to ignite a debate. Ami Levin has a chapter on the debate in "SQL Server Deep Dives Volume 2" and it covers both sides to the argument.

What matters in the decision here is the overall design of the table, the uniqueness of the columns and if the values are ever-increasing. If you add more columns to this table and add nonclustered indexes to them, all nonclustered indexes inherit the values of the clustered index, which increases the size, so you'd want the size of the clustered index to be as narrow as possible. If you add an artificial primary key and the two columns are unique, you should also consider adding a unique constraint to the two columns.

I know I've thrown out a lot of theory here, but as with so very many things, "it depends" strikes again.


All good points. The only thing that you've left out of the "It Depends" scenario is how the table will be used. See my "It Depends" entry a couple of posts up.

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
ScottPletcher
ScottPletcher
SSCertifiable
SSCertifiable (7.9K reputation)SSCertifiable (7.9K reputation)SSCertifiable (7.9K reputation)SSCertifiable (7.9K reputation)SSCertifiable (7.9K reputation)SSCertifiable (7.9K reputation)SSCertifiable (7.9K reputation)SSCertifiable (7.9K reputation)

Group: General Forum Members
Points: 7895 Visits: 7154
Actually, it is a terrible idea to add an identity to this table.

Depending on your specific data, future inserts could be sequential anyway. Even if not, rows are inserted once and typically read 10s, or even 100s, 1000s or more times, particularly for lookup tables, which this certainly seems to be. Besides, tables can of course be rebuilt when truly needed, and rebuilt online for this table (assuming Enterprise or other qualifying edition).

Overall, the biggest performance gain you'll get is to stop using dopey identity clusters by default and instead consider carefully how the table is accessed. I've removed over 6000 indexes and improved SELECT performance dramatically, particularly for joins, the majority of it from just replacing identity clusters with properly-keyed clustered indexes.

SQL DBA,SQL Server MVP(07, 08, 09)[size=2]Prosecutor James Blackburn, in closing argument in the Fatal Vision murders trial: If in the future, you should cry a tear, cry one for them [the murder victims]. If in the future, you should say a prayer, say one for them. And if in the future, you should light a candle, light one for them.[/size]
Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)

Group: General Forum Members
Points: 86041 Visits: 41095
ScottPletcher (1/6/2014)
Actually, it is a terrible idea to add an identity to this table.

Depending on your specific data, future inserts could be sequential anyway. Even if not, rows are inserted once and typically read 10s, or even 100s, 1000s or more times, particularly for lookup tables, which this certainly seems to be. Besides, tables can of course be rebuilt when truly needed, and rebuilt online for this table (assuming Enterprise or other qualifying edition).

Overall, the biggest performance gain you'll get is to stop using dopey identity clusters by default and instead consider carefully how the table is accessed. I've removed over 6000 indexes and improved SELECT performance dramatically, particularly for joins, the majority of it from just replacing identity clusters with properly-keyed clustered indexes.



First of all, stop using words like "dopey". There are some very good uses of IDENTITY columns that I use and your comments infer that I'm dopey. Unless you really like pork chops, please keep such comments to yourself. They don't become you as a professional, they add nothing to the conversation, and they can easily be taken as attempts to belittle or as ad hominem attacks.

It certainly is possible that "future inserts could be sequential". If that's true, then that fits one of the commonly accepted best practices of making a Clustered Index "ever increasing" and you would, indeed, not need an identity column for this table.

If, however, they are not entered sequentially, then it's not so much the performance of SELECTs that I'd be concerned with. Rather it would be the performance of the INSERTS. If there are a lot of INSERTs to this table by a lot of concurrent users, there will be massive page splits that can and will cause timeouts at the GUI.

If the table is mostly static and is of manageable size, then I agree, it almost doesn't matter if an occasional page split occurs. That brings up another point. You have to also plan on maintenance requirements and space. If this is a large table, then massive page splits can easily cause the table to double in size due to the space wasted by the page splits and, contrary to popular belief, disk space isn't cheap. Yeah... you could REORGANIZE the Clustered Index on a regular basis, but even though you can do that in an online fashion, it's expensive CPU and IO wise. The other thing is that REORGANIZE is FULLY LOGGED no matter what recovery mode you’re in and REORGANIZE does NOT rebuild the B-TREE.

If you decide to REBUILD a Clustered Index, you might be able to get away with the WITH DROP EXISTING optimization to keep your MDF file from growing. If you can't, then remember that any index over 128 extents (just 8MB), will have its page usage preserved until the new copy of the index has been created. If that's the Clustered Index of a 1TB table, then you better have a free TB+ hanging around to REBUILD it.

The only good part about REBUILDing indexes is that they will be minimally logged in the BULK-LOGGED and SIMPLE recovery modes. Just keep in mind that if a minimally logged operation occurs while in the BULK-LOGGED mode, any log backups made during that timeframe cannot be partially used for true Point-in-Time recovery. You either have to use the whole log backup or you have to stop at a point before it.

You ALSO have to consider the point (prior to 2014) that if a table had ANY blob columns in it, then you can NOT rebuild the Clustered index online. If the table has a blob in it, that automatically means that the Clustered Index cannot be rebuilt online.

So, lots to consider. You can't just say, "Besides, tables can of course be rebuilt when truly needed, and rebuilt online for this table" because you might not actually be able to either because of blobs or because you don't actually have the disk space.

The overall problem with this thread, so far, is that no one but the OP actually knows HOW the table will be used and he hasn't said yet.

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
ScottPletcher
ScottPletcher
SSCertifiable
SSCertifiable (7.9K reputation)SSCertifiable (7.9K reputation)SSCertifiable (7.9K reputation)SSCertifiable (7.9K reputation)SSCertifiable (7.9K reputation)SSCertifiable (7.9K reputation)SSCertifiable (7.9K reputation)SSCertifiable (7.9K reputation)

Group: General Forum Members
Points: 7895 Visits: 7154
Jeff Moden (1/6/2014)
ScottPletcher (1/6/2014)
Actually, it is a terrible idea to add an identity to this table.

Depending on your specific data, future inserts could be sequential anyway. Even if not, rows are inserted once and typically read 10s, or even 100s, 1000s or more times, particularly for lookup tables, which this certainly seems to be. Besides, tables can of course be rebuilt when truly needed, and rebuilt online for this table (assuming Enterprise or other qualifying edition).

Overall, the biggest performance gain you'll get is to stop using dopey identity clusters by default and instead consider carefully how the table is accessed. I've removed over 6000 indexes and improved SELECT performance dramatically, particularly for joins, the majority of it from just replacing identity clusters with properly-keyed clustered indexes.



First of all, stop using words like "dopey". There are some very good uses of IDENTITY columns that I use and your comments infer that I'm dopey. Unless you really like pork chops, please keep such comments to yourself. They don't become you as a professional, they add nothing to the conversation, and they can easily be taken as attempts to belittle or as ad hominem attacks.

It certainly is possible that "future inserts could be sequential". If that's true, then that fits one of the commonly accepted best practices of making a Clustered Index "ever increasing" and you would, indeed, not need an identity column for this table.

If, however, they are not entered sequentially, then it's not so much the performance of SELECTs that I'd be concerned with. Rather it would be the performance of the INSERTS. If there are a lot of INSERTs to this table by a lot of concurrent users, there will be massive page splits that can and will cause timeouts at the GUI.

If the table is mostly static and is of manageable size, then I agree, it almost doesn't matter if an occasional page split occurs. That brings up another point. You have to also plan on maintenance requirements and space. If this is a large table, then massive page splits can easily cause the table to double in size due to the space wasted by the page splits and, contrary to popular belief, disk space isn't cheap. Yeah... you could REORGANIZE the Clustered Index on a regular basis, but even though you can do that in an online fashion, it's expensive CPU and IO wise. The other thing is that REORGANIZE is FULLY LOGGED no matter what recovery mode you’re in and REORGANIZE does NOT rebuild the B-TREE.

If you decide to REBUILD a Clustered Index, you might be able to get away with the WITH DROP EXISTING optimization to keep your MDF file from growing. If you can't, then remember that any index over 128 extents (just 8MB), will have its page usage preserved until the new copy of the index has been created. If that's the Clustered Index of a 1TB table, then you better have a free TB+ hanging around to REBUILD it.

The only good part about REBUILDing indexes is that they will be minimally logged in the BULK-LOGGED and SIMPLE recovery modes. Just keep in mind that if a minimally logged operation occurs while in the BULK-LOGGED mode, any log backups made during that timeframe cannot be partially used for true Point-in-Time recovery. You either have to use the whole log backup or you have to stop at a point before it.

You ALSO have to consider the point (prior to 2014) that if a table had ANY blob columns in it, then you can NOT rebuild the Clustered index online. If the table has a blob in it, that automatically means that the Clustered Index cannot be rebuilt online.

So, lots to consider. You can't just say, "Besides, tables can of course be rebuilt when truly needed, and rebuilt online for this table" because you might not actually be able to either because of blobs or because you don't actually have the disk space.

The overall problem with this thread, so far, is that no one but the OP actually knows HOW the table will be used and he hasn't said yet.



Even if some splits do occur, the table won't "easily double in size", since that would require every page to split.

Automatically adding an identity as the clustering key by default simply is dopey. The clustered index is the most important index on any table, and so it should always be considered carefully.

SQL DBA,SQL Server MVP(07, 08, 09)[size=2]Prosecutor James Blackburn, in closing argument in the Fatal Vision murders trial: If in the future, you should cry a tear, cry one for them [the murder victims]. If in the future, you should say a prayer, say one for them. And if in the future, you should light a candle, light one for them.[/size]
Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)

Group: General Forum Members
Points: 86041 Visits: 41095
ScottPletcher (1/7/2014)
Even if some splits do occur, the table won't "easily double in size", since that would require every page to split.

Automatically adding an identity as the clustering key by default simply is dopey. The clustered index is the most important index on any table, and so it should always be considered carefully.


You've apparently not experienced the type of massive splitting that I've seen happen. And yes, the clustered index is the most important index on any table and "it should always be considered carefully". That also means not summarily dismissing things like IDENTITY columns. That would truly be a "dopey" thing to do. ;-)

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search