Reusing Identities

Question

Reusing Identities

Dinesh Priyankara

SSCommitted

Points: 1681
More actions
February 8, 2003 at 12:00 am

#105590

Comments posted to this topic are about the content posted at http://www.sqlservercentral.com/columnists/dPriyankara/reuseidentities.asp

Viewing 10 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply

Jeremy Valued Member Points: 60 More actions · Answer 1

I don't think your second method is correct. You are looking for the lowest number in the first gap, correct? In that case you'll either have to add an ORDER BY clause to your TOP query, or use "SELECT Min(intNumber)" instead of the TOP clause. The point is that the only way to guarantee that results are ordered - and thus to guarantee that the TOP 1 clause delivers the lowest gap - is to add an order by clause.

Of course, in practice, due to the clustered primary key on the Numbers table you might well witness the correct result every time you try it without the ORDER BY, but there's no certainty. This behaviour might change with differences in the clustered index on the other table, differences in paralellism, or changes in the underlying db engine in future SQL Server versions.

I would either rewrite your method as:

SELECT TOP 1 intNumber

FROM OrderHeader

RIGHT OUTER JOIN tb_Numbers

ON tb_Numbers.intNumber = OrderHeader.intID

WHERE intID IS NULL AND intNumber < = (SELECT MAX(intID) FROM OrderHeader)

ORDER BY tb_Numbers.intNumber

- or -

SELECT Min(intNumber)

FROM OrderHeader

RIGHT OUTER JOIN tb_Numbers

ON tb_Numbers.intNumber = OrderHeader.intID

WHERE intID IS NULL AND intNumber < = (SELECT MAX(intID) FROM OrderHeader)

It would be interesting to see if the differences in performance with these (IMHO) correct methods are still as large, and which of these two alternative methods would be the fastest.

Jeremy

Dinesh Priyankara SSCommitted Points: 1681 More actions · Answer 2

Hi Jeremy,

I agree with you if the OrderHeader table has no index created with IDENTITY column. But as a general rule, we always create clustered or nonclustered unique index.

If no indexes available, Of course you have to use one of your statements. Then there will be slight performance problems but still the method is the winner.

Dinesh Priyankara

mcp mcse mcsd mcdba

Michael R Schmidt Mr or Mrs. 500 Points: 532 More actions · Answer 3

First: With a Clustered PK the data is *always* returned in the natual table order, so order by is not necessary. Second: why do we need to get the min occurence, all we really care about is a missing ident right? so just give me the first ident and I will be happy.

Michael R. Schmidt
Developer

Dinesh Priyankara SSCommitted Points: 1681 More actions · Answer 4

Hi Mike,

Actually what we want is, reuse removed identities. So, whatever the method, whether TOP 1 or MIN, you get desired result. But if you consider about performance, TOP 1 is better.

[font=Verdana]Dinesh Priyankara[font=Verdana]

mcp mcse mcsd mcdba

Andy Warren SSC Guru Points: 119922 More actions · Answer 5

In a high volume environment, wouldnt you run into issues with two users trying to use (re-use) the same key?

Andy

http://www.sqlservercentral.com/columnists/awarren/

Andy
Connect with me on LinkedIn

Dinesh Priyankara SSCommitted Points: 1681 More actions · Answer 6

Hi Andy,

Why can’t we run this insertion process within a transaction?

Dinesh Priyankara

Edited by - Dinesh on 03/06/2003 9:17:45 PM

Andy Warren SSC Guru Points: 119922 More actions · Answer 7

You can of course, and that is the right answer. Downside will be (or may be anyway!) blocking.

Andy

http://www.sqlservercentral.com/columnists/awarren/

Andy
Connect with me on LinkedIn

Razvan Socol SSCarpal Tunnel Points: 4968 More actions · Answer 8

Dinesh wrote:

As per the output, there are 20086 logical reads and it has taken 200 ms for the first method. But in second method there are only 19 logical reads and the execution time is less considerable.
That’s why I selected to continue in my way. But there may be a side that I have not seen but you can see. So, try on this and see whether how this T-SQL solution suit for you.

Actually, I think that you are not considering two things:

1. The overhead of creating a table of 30000 rows is much greater than the difference between the two queries. If you prefer to store this table permanently, you have to consider that it's size will be added to each backup and that you need to adjust the number of rows as more data will be added in the main table.

2. The example from Books Online doesn't assume that the seed and the interval are 1, instead it determines them and this is the main cause of the many logical reads. Try:

SELECT MIN(IDENTITYCOL) + 1 as NextIdentityValue

FROM OrderHeader t1

WHERE IDENTITYCOL BETWEEN 1 AND 32766

AND NOT EXISTS (SELECT * FROM OrderHeader t2

WHERE t2.IDENTITYCOL = t1.IDENTITYCOL + 1)

You will get only about 120 logical reads (instead of about 20000) and the execution time is much better.

Razvan Socol<O>

</O>

G.R.Prithiviraj Kulasingham SSCarpal Tunnel Points: 4105 More actions · Answer 9

I am replying after a looooooooooooong time

On the first method the author is using a not so optimized query. He is using different methods in each queries (sub query vs outer join and Min vs top 1). More than that, Should we declare a variable, get the result into the variable and select the variable?

I tried with Left outer join and the results are really better. (I tried with both statistics time and statistics IO ON)

Select top 1 t1.IDENTITYCOL + IDENT_INCR('OrderHeader') As NextIdentityValue

from OrderHeader t1 Left Outer join OrderHeader t2

ON T1.IDENTITYCOL+IDENT_INCR('OrderHeader')=t2.IDENTITYCOL

Where t2.IDENTITYCOL is null

This is the result:

SQL Server Execution Times:

CPU time = 0 ms, elapsed time = 0 ms.

SQL Server Execution Times:

CPU time = 0 ms, elapsed time = 0 ms.

SQL Server parse and compile time:

CPU time = 0 ms, elapsed time = 2 ms.

2210

Table 'OrderHeader'. Scan count 2210, logical reads 4450, physical reads 0, read-ahead reads 0.

SQL Server Execution Times:

CPU time = 26 ms, elapsed time = 26 ms.

SQL Server parse and compile time:

CPU time = 0 ms, elapsed time = 0 ms.

Later I introduced a variable to get the Indentity increment first the the results are even better

Declare @Increment int

Select @Increment= IDENT_INCR('OrderHeader')

Select top 1 t1.IDENTITYCOL + @Increment As NextIdentityValue

from OrderHeader t1 Left Outer join OrderHeader t2

ON T1.IDENTITYCOL + @Increment = t2.IDENTITYCOL

Where t2.IDENTITYCOL is null

The results are slightly better:

SQL Server Execution Times:

CPU time = 0 ms, elapsed time = 0 ms.

SQL Server Execution Times:

CPU time = 0 ms, elapsed time = 0 ms.

SQL Server parse and compile time:

CPU time = 0 ms, elapsed time = 2 ms.

SQL Server Execution Times:

CPU time = 0 ms, elapsed time = 0 ms.

2210

Table 'OrderHeader'. Scan count 2136, logical reads 4439, physical reads 0, read-ahead reads 0.

SQL Server Execution Times:

CPU time = 7 ms, elapsed time = 7 ms.

SQL Server parse and compile time:

CPU time = 0 ms, elapsed time = 0 ms.

Finally, I changed the min into top 1. the results are almost to my previous results:

SELECT top 1 IDENTITYCOL + IDENT_INCR('OrderHeader') As NextIdentityValue

FROM OrderHeader t1

WHERE IDENTITYCOL BETWEEN IDENT_SEED('OrderHeader') AND 32766

AND NOT EXISTS (SELECT * FROM OrderHeader t2

WHERE t2.IDENTITYCOL = t1.IDENTITYCOL + IDENT_INCR('OrderHeader'))

SQL Server Execution Times:

CPU time = 0 ms, elapsed time = 0 ms.

SQL Server Execution Times:

CPU time = 0 ms, elapsed time = 0 ms.

SQL Server parse and compile time:

CPU time = 3 ms, elapsed time = 3 ms.

2210

Table 'OrderHeader'. Scan count 2210, logical reads 4440, physical reads 0, read-ahead reads 0.

SQL Server Execution Times:

CPU time = 15 ms, elapsed time = 26 ms.

SQL Server parse and compile time:

CPU time = 0 ms, elapsed time = 0 ms.

Cheers,
Prithiviraj Kulasingham

http://preethiviraj.blogspot.com/