Using IDENTITY as a key column

Question

Using IDENTITY as a key column

Viewing 15 posts - 46 through 60 (of 67 total)

You must be logged in to reply to this topic. Login to reply

Paul White SSC Guru Points: 150468 More actions · Answer 1

Hugo Kornelis (4/20/2010)
SQL Server will not add this index on its own, so SQL Server has no efficient way to retrieve rows based on a known value for PersonID. We can give SQL Server such an efficient way, but that would require additional DDL.

That depends on the number of rows in the table.

I thought we covered this point already

Paul White
All articles available on SQL.kiwi
@SQL_Kiwi

Glenn Dorling SSCertifiable Points: 7972 More actions · Answer 2

Sorry Hugo, but I have to add my voice to the nay-sayers.

For the first part of the question your answer is based on the fact that by default SQL Server will not allow duplicate IDs, but explicitly a duplicate ID can be created.

However, for the second part of the question you're arguing things the other way around. By default SQL Server will not add an index, but there is an explicit way to provide an efficient query plan.

So for the first part you're assuming explicit behaviour but for the second you're assuming implicit behaviour.

By the way though, I agree with the basic premise of the question. There's many myths around identity fields (can never have duplicates, can never have gaps, ...). My "favourite" that we get tripped on far too often is when developers (especially vendors) don't think about what happens when the maximum field value is reached.

Hugo Kornelis SSC Guru Points: 64790 More actions · Answer 3

Glenn Dorling (4/20/2010)
Sorry Hugo, but I have to add my voice to the nay-sayers.
For the first part of the question your answer is based on the fact that by default SQL Server will not allow duplicate IDs, but explicitly a duplicate ID can be created.
However, for the second part of the question you're arguing things the other way around. By default SQL Server will not add an index, but there is an explicit way to provide an efficient query plan.
So for the first part you're assuming explicit behaviour but for the second you're assuming implicit behaviour.
By the way though, I agree with the basic premise of the question. There's many myths around identity fields (can never have duplicates, can never have gaps, ...). My "favourite" that we get tripped on far too often is when developers (especially vendors) don't think about what happens when the maximum field value is reached.

If multiple people say the same, then there has to some truth in it.

When I submitted the question, I thought that the words "duplicate values for PersonID might still occur" (not "will occur" or so) would imply that this part of the question is whether it is possible or not, whereas the words "SQL Server does not have an efficient way" (not "might be given an efficient way" or so) would imply that that part of the question is about access based on the table posted.

Apparently, this was not clear to all. My apologies to all who got confused because of this, and I hope you still learned something from this question.

Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
Visit my SQL Server blog: https://sqlserverfast.com/blog/
SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

Paul White SSC Guru Points: 150468 More actions · Answer 4

Glenn Dorling (4/20/2010)
So for the first part you're assuming explicit behaviour but for the second you're assuming implicit behaviour.

Glenn, I think that's a stretch, and an artificial distinction to make. The first part was entirely clear and accurate, in my personal view.

Paul White
All articles available on SQL.kiwi
@SQL_Kiwi

Toreador SSChampion Points: 11378 More actions · Answer 5

I'd just like to make clear that although I was (I think) the first person to comment on the answer to the second part, I was reflecting negatively on myself, and not the question (which was fine)!

If you apply common sense rather than obsessive pedantry* then it's pretty obvious what's meant, isn't it?

* that's another negative comment against myself, not an attack on others

Paul White SSC Guru Points: 150468 More actions · Answer 6

Toreador (4/21/2010)
...although I was (I think) the first person to comment on the answer to the second part...

Nope!

Paul White
All articles available on SQL.kiwi
@SQL_Kiwi

pjdiller SSC Eights! Points: 977 More actions · Answer 7

Speaking (typing really) as someone who was initially annoyed that they got the question wrong, I'd like to now say that I'm happy with the overall result.

I think it's been demonstrated more than a few times (every time?) that the original question is best as a conversation-starter.

Thanks first to Hugo and then to the rest!

rjv_rnjn Hall of Fame Points: 3023 More actions · Answer 8

pjdiller (4/21/2010)
I think it's been demonstrated more than a few times (every time?) that the original question is best as a conversation-starter.
Thanks first to Hugo and then to the rest!

Agree {we need a smiley update, no "thumbs-up" :angry: } Even for question where I get the answer right, going through the conversations make me realize certain intricacies involved that I didn't think through. Thanks to all for those conversations.

vk-kirov SSCertifiable Points: 7686 More actions · Answer 9

Hugo Kornelis (4/20/2010)
If SQL Server had an auto-tune feature that automatically adds indexes it beliefs will benefit performance, than you would be right. But no current version of SQL Server has such a feature (and based on recommendations I sometimes see from the DTA, that might as well be for the best).

Sometimes SQL Server automatically creates temporary indexes if the optimizer finds out that it will improve query performance.

Ref: Nonclustered Index Spool Showplan Operator http://msdn.microsoft.com/en-us/library/ms189611.aspx

Of course, the index spool operation will not happen in the statement 'SELECT * FROM Persons WHERE PersonID = @PersonID', but it may happen in some statement with many JOINs. And of course, it would be less effective than explicit index creation on PersonID and using that index

Hugo Kornelis SSC Guru Points: 64790 More actions · Answer 10

vk-kirov (4/22/2010)
Hugo Kornelis (4/20/2010)
If SQL Server had an auto-tune feature that automatically adds indexes it beliefs will benefit performance, than you would be right. But no current version of SQL Server has such a feature (and based on recommendations I sometimes see from the DTA, that might as well be for the best).
Sometimes SQL Server automatically creates temporary indexes if the optimizer finds out that it will improve query performance.
Ref: Nonclustered Index Spool Showplan Operator http://msdn.microsoft.com/en-us/library/ms189611.aspx
Of course, the index spool operation will not happen in the statement 'SELECT * FROM Persons WHERE PersonID = @PersonID', but it may happen in some statement with many JOINs. And of course, it would be less effective than explicit index creation on PersonID and using that index

Thanks, vk-kirov. I was not aware of this operator.

From the limited information I could find on it, it will be used if the same data is used at least twice in a query, the second needs only a subset of the rows needed for the first (otherwise a normal spool would be used), and the subset is based on a predicate that is sargable and selective enough to earn back the cost of creating the index.

Do you think you can post a repro script that actually uses this operator? Have you ever seen it used in your code?

Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
Visit my SQL Server blog: https://sqlserverfast.com/blog/
SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

Paul White SSC Guru Points: 150468 More actions · Answer 11

Hugo Kornelis (4/22/2010)
Thanks, vk-kirov. I was not aware of this operator.
From the limited information I could find on it, it will be used if the same data is used at least twice in a query, the second needs only a subset of the rows needed for the first (otherwise a normal spool would be used), and the subset is based on a predicate that is sargable and selective enough to earn back the cost of creating the index.
Do you think you can post a repro script that actually uses this operator? Have you ever seen it used in your code?

You can generate an Index Spool easily with an unindexed Tally table:

SELECT N

INTO #Tally

FROM dbo.Numbers (50000);

GO

SELECT COUNT(*)

FROM #Tally T1

JOIN #Tally T2

ON T2.N < T1.N;

That uses an in-line TVF I keep handy to generate a temporary tally heap - you can use any method that generates a similar object.

Index spools are not all that rare, but the estimated cost of spooling the rows to a worktable, creating the index, and dropping it afterward is quite high, so the optimiser will often choose a different approach, purely on cost grounds (usually a hash join - the example above avoids that by joining on an inequality only).

There are two types of index spool - eager and lazy. The above example generates an eager spool - one that consumes all rows from its input before it can be used to seek with.

A lazy index spool is a bit more like a regular table spool - except it never truncates its worktable (a normal spool only ever caches one result - it truncates its worktable on every rebind).

A lazy index spool accumulates results on every rebind (when the correlated parameter changes). Personally, I'd love a way to be able to hint the use of lazy index spools, but that's another story.

Interestingly, you cannot trust the reported numbers for rebinds and rewinds for an index spool - the internals report a rewind to anything other than the last value seen as a rebind rather than the rewind it actually is. Sad, but true.

Paul White
All articles available on SQL.kiwi
@SQL_Kiwi

vk-kirov SSCertifiable Points: 7686 More actions · Answer 12

Hugo Kornelis (4/22/2010)
Do you think you can post a repro script that actually uses this operator? Have you ever seen it used in your code?

Yes, sometimes I see this operator when I profile and optimize SQL queries. When I wrote that post, I was optimizing a long-running stored procedure, and there even were both lazy and eager index spools in one query. I tried to make a simplified example (the original query contained 3 tables, 1 view, and 2 subqueries), but all what I got was either hash join or table spool

Thanks to Paul for posting a simple example!

Pooja-462303 SSC Veteran Points: 210 More actions · Answer 13

I am really surprised,

I thought I was a newbie but now i have to call myself lower than newbie,

Can you pls give me an instance/how the identity column can have duplicate.

Thank you

Hugo Kornelis SSC Guru Points: 64790 More actions · Answer 14

Pooja-462303 (4/29/2010)
Can you pls give me an instance/how the identity column can have duplicate.

Sure!

Using SET IDENTITY_INSERT ON:

CREATE TABLE Demo1

(Id int NOT NULL IDENTITY,

Other varchar(20)

);

INSERT INTO Demo1 (Other)

VALUES ('First');

SET IDENTITY_INSERT Demo1 ON;

INSERT INTO Demo1 (Id, Other)

VALUES (1, 'Another first?');

SELECT * FROM Demo1;

DROP TABLE Demo1;

Using DBCC CHECKIDENT with RESEED:

CREATE TABLE Demo2

(Id int NOT NULL IDENTITY,

Other varchar(20)

);

INSERT INTO Demo2 (Other)

VALUES ('First');

DBCC CHECKIDENT (Demo2, RESEED, 0);

INSERT INTO Demo2 (Other)

VALUES ('Another first?');

SELECT * FROM Demo2;

DROP TABLE Demo2;

Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
Visit my SQL Server blog: https://sqlserverfast.com/blog/
SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

ChiragNS One Orange Chip Points: 26137 More actions · Answer 15

An iffy question IMHO. You need to make certain assumptions to come to the right/wrong answer.

Are the points being returned?

"Keep Trying"

Using IDENTITY as a key column

Cookies on SQLServerCentral