RE: Using IDENTITY as a key column

April 20, 2010 at 1:33 am

Thanks for the feedback, to all involved in this discussion so far.

I appreciate all the kind words and compliments, and I'll try to address the questions and criticism in the rest of this post.

Paul White NZ (4/19/2010)
I am less sure about the wording in the second part of the question, "SQL Server does not have an efficient way to retrieve rows based on a known value for PersonID." because that rather depends on the number of rows in the table.
I don't think it would have been giving too much away to say something like "SQL Server cannot use an index to locate rows based on a known value for PersonID".

I wanted to disagree with this at first, but since several other people have posted similar comments, I'll have to accept that, obviously, this was not clear enough.

However, I don't understand the argument that this depends on the number of rows in the table. Can you explain for what number of rows a SELECT based on WHERE PersonID = @PersonID will be faster without an index on PersonID, and why?

ST_John (4/20/2010)
Hang on a minute ...
"Even though the IDENTITY property is intended to be used for key columns, SQL Server will not automatically generate a constraint to enforce the uniqueness, so duplicate values (caused by the methods above) will not be banned from the table. SQL Server will also not automatically create an index to speed up access based on the IDENTITY value (though it will do so when you add the PRIMARY KEY or UNIQUE constraint to enforce uniqueness of the IDENTITY column, as it does for each PRIMARY KEY or UNIQUE constraint). "
the question asked if SQL server had an efficient way to query records, not if it would use that method automatically!

Everything you write is true, but I don't really see the point you are trying to make. Since the CREATE TABLE statement as given will not create any indexes, there is no efficient access path to query records based on known values for PersonID; only an index will provide one. As long as it's impossible to query efficiently, the question of whether the efficient way will indeed be used is immaterial.
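To make this concrete, here is a minimal sketch of the situation. The table name dbo.Persons, the PersonName column, and the constraint name PK_Persons are made up for illustration (only PersonID comes from the question), so treat this as an approximation of the question's table rather than the exact statement.

```sql
-- Hypothetical table: only the PersonID column name comes from the question.
CREATE TABLE dbo.Persons
   (PersonID   int IDENTITY(1,1) NOT NULL,
    PersonName varchar(50) NOT NULL);   -- assumed stand-in for the other columns

-- No constraint or index was declared, so the table is stored as a heap.
-- sys.indexes shows a single row of type HEAP for it.
SELECT name, type_desc
FROM   sys.indexes
WHERE  object_id = OBJECT_ID('dbo.Persons');

-- Without an index on PersonID, this lookup has to scan the table.
DECLARE @PersonID int;
SET @PersonID = 42;

SELECT PersonID, PersonName
FROM   dbo.Persons
WHERE  PersonID = @PersonID;

-- Declaring the key enforces uniqueness and creates a unique index
-- that the same lookup can then use.
ALTER TABLE dbo.Persons
  ADD CONSTRAINT PK_Persons PRIMARY KEY (PersonID);
```

Comparing the execution plan of the SELECT before and after the ALTER TABLE should show the difference between scanning the table and seeking in the index.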

tommyh (4/20/2010)
The heading said "Using IDENTITY as a key column". Key part there being "key". Now combined with the fact that the create table statement was incomplete "-- other columns);". Got me wondering if the author had a missing "primary key (PersonID)" at the end (or something similar). Or that is was implied that there would be code to create a key. Which of course would have inverted the answer.

The comment said "-- (other columns)", not "-- (other columns and constraints)". Assuming that I meant to include constraints when I wrote columns is quite a stretch.

The reason I chose the title "Using IDENTITY as a key column" is that this is exactly the pattern I sometimes see: tables with an IDENTITY attribute, with queries that use this column as if it were the key, but no constraints to enforce that key. I guess I could also have chosen the title "Using IDENTITY as a key column without enforcing its uniqueness", but I'm afraid that would have given the question away. Anyway, I'm sorry if the chosen title confused you. My intention was to confuse people with believable but incorrect answer options, and nothing else 😀

dimitri.decoene-1027745 (4/20/2010)
Moreover, a part of the microsoft online help was a bit misleading on the subject:
This is because the IDENTITY property is guaranteed to be unique only for the table on which it is used.
How should I interpret "identity property" in this sentence?

To answer the actual question first - you can interpret it exactly as you did; it refers to the property that is assigned by adding "IDENTITY" to a column in a CREATE or ALTER TABLE statement.

I had to google for the page where you took the quote from to see it in its context. I found it at http://msdn.microsoft.com/en-us/library/ms191131.aspx; if you found it somewhere else please give me a link and I'll review it.

I would not go as far as to say that the text there is incorrect, but it is incomplete and might cause confusion. What the IDENTITY property does give you is a "limited" uniqueness guarantee. Values generated by the IDENTITY property will be unique, as long as you never tamper with them. The two ways of tampering I know of are SET IDENTITY_INSERT and DBCC CHECKIDENT ... RESEED (there might be more, so please don't take this as an authoritative, complete list). As long as you never tamper with the IDENTITY values in any way, you can rely on the uniqueness of generated IDENTITY values. At least, as long as you limit your scope to a single table; as soon as multiple tables are involved, IDENTITY values will no longer be unique - and that last sentence is what the quoted fragment from the online help is trying to tell you.
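For anyone who wants to see the two tampering methods in action, here is a small sketch. The table dbo.IdentityDemo and its columns are invented for this example, and there is deliberately no PRIMARY KEY or UNIQUE constraint on the IDENTITY column, which is what allows the duplicates in.

```sql
-- Hypothetical demo table with an IDENTITY column and no uniqueness constraint.
CREATE TABLE dbo.IdentityDemo
   (DemoID  int IDENTITY(1,1) NOT NULL,
    Payload varchar(20) NOT NULL);

-- Normal use: SQL Server generates the values.
INSERT INTO dbo.IdentityDemo (Payload) VALUES ('first');
INSERT INTO dbo.IdentityDemo (Payload) VALUES ('second');

-- Tampering, method 1: supply an explicit value that is already in use.
SET IDENTITY_INSERT dbo.IdentityDemo ON;
INSERT INTO dbo.IdentityDemo (DemoID, Payload) VALUES (1, 'explicit duplicate');
SET IDENTITY_INSERT dbo.IdentityDemo OFF;

-- Tampering, method 2: move the current identity value back.
-- With rows already present, the next generated value is the reseed value
-- plus the increment, which is a value that was handed out before.
DBCC CHECKIDENT ('dbo.IdentityDemo', RESEED, 0);
INSERT INTO dbo.IdentityDemo (Payload) VALUES ('after reseed');

-- List the IDENTITY values that now occur more than once.
SELECT   DemoID, COUNT(*) AS NumberOfRows
FROM     dbo.IdentityDemo
GROUP BY DemoID
HAVING   COUNT(*) > 1;
```

With a PRIMARY KEY or UNIQUE constraint on DemoID, both tampering inserts would fail with a constraint violation instead of silently creating duplicates.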

I hope this clarifies the confusion.

Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
Visit my SQL Server blog: https://sqlserverfast.com/blog/
SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/