How to maintain an incrementing row id

  • This problem is so fundamental, I'm hesitant to ask it, but here goes.

    I have a table whose primary key consists of two fields. The first key field (sessionID) identifies a set of rows; the second key field (logID) is simply the row identifier within that set. I could have made logID an identity field, but there is some utility in each sessionID's set having logIDs that begin at 1.

    This table has a high volume of inserts. What is a best-practice way of determining the next logID for a set and inserting a new row, while
    1) avoiding a key violation from a competing transaction,
    2) avoiding deadlocks,
    3) and minimizing locking overhead?

    SQL Server 2005

    The transaction isolation level is Read Committed

    The code to create the table is:

    create table eventLog (sessionID int, logID int, logText varchar(1000), created datetime)
    create unique clustered index eventLog_pk on eventLog (sessionID, logID)

    The solution below was attempted but recently failed with a key violation. It was the only time the error has been encountered in millions of updates, though. I believe the code below failed because the Read Committed isolation level does not protect against phantom rows. I considered using the Serializable table hint in the subquery, but felt someone would have a better answer.

    declare
        @attempts int,
        @sessionID int,
        @logID int,
        @logText varchar(1000)

    -- in practice these would be passed in as parameters
    select @sessionID = 1, @logText = 'test'

    select @attempts = 0, @logID = isNull(max(logID),0)+1
    from eventLog
    where sessionID = @sessionID

    while @attempts < 5
    begin
        select @attempts = @attempts + 1

        insert into eventLog (sessionID, logID, logText)
        select @sessionID, @logID, @logText
        where not exists
            (select 1 from eventLog
             where sessionID = @sessionID
               and logID = @logID)

        if @@rowcount < 1
            select @logID = @logID + 1
        else
            break
    end

  • Here's something I came up with. Not sure it's bullet-proof, but maybe it will give you some ideas. Sorry that it's a bit of a kludge and not very elegant... :crazy:

    First some sample data:

    IF OBJECT_ID('dbo.eventLog') IS NOT NULL
        DROP TABLE dbo.eventLog

    CREATE TABLE dbo.eventLog
    (
        sessionID INT NOT NULL
        ,logID TIMESTAMP NOT NULL
        ,logText VARCHAR(1000) NULL
        ,created DATETIME NULL
        ,PRIMARY KEY CLUSTERED (sessionID, logID)
        ,UNIQUE (sessionID)
    )

    ;WITH cteSampleData (sessionID,logText,created)
    AS
    (
        SELECT 1001,'George Washington','2013-03-31' UNION ALL
        SELECT 1002,'John Adams','2013-02-28' UNION ALL
        SELECT 1003,'Thomas Jefferson','2013-02-15' UNION ALL
        SELECT 1004,'James Madison','2013-02-01' UNION ALL
        SELECT 1005,'James Monroe','2013-01-31' UNION ALL
        SELECT 1006,'John Q Adams','2013-01-22' UNION ALL
        SELECT 1007,'Andrew Jackson','2013-01-13'
    )
    INSERT INTO dbo.eventLog (sessionID,logText,created)
    SELECT sessionID, logText, created
    FROM cteSampleData

    SELECT * FROM dbo.eventLog

    Now of course the following insert will fail (just to set up a base case):

    ;WITH cteNewData (sessionID,logText,created)
    AS
    (
        SELECT 1001,'Ronald Reagan','2013-05-31'
    )
    INSERT INTO dbo.eventLog (sessionID,logText,created)
    SELECT sessionID, logText, created
    FROM cteNewData

    Msg 2627, Level 14, State 1, Line 1
    Violation of UNIQUE KEY constraint 'UQ__eventLog__23DB12CA731B1205'. Cannot insert duplicate key in object 'dbo.eventLog'. The duplicate key value is (1001).
    The statement has been terminated.

    Now the rest of the code below takes some explaining.

    The first block of code will do an insert. But FOR TESTING I've put in a wait state so that you can flip over to another window and insert a DIFFERENT row while the first is waiting. This simulates what might happen if the next ID has been assigned, but before the insert completes a second insert attempt tries to use the same value.

    So this code will INSERT a new row and "hold" that value even if another insert takes place in the interim. It does this by inserting only the sessionID and then coming back and doing an update, even if another row has been inserted behind it. To see how this works, before running the code, create another window with the second insert command. Then run the first script, and while it's waiting flip over and insert a record (or two or more) from the second page. After the wait state expires (I made it 15 secs for testing) you can go back to the first page and see that the first row has been updated properly. The only minor anomaly I see is that the timestamp column value may not be in order.

    WINDOW 1:

    DECLARE @NextID INT

    BEGIN
        -- get the next id
        ;WITH NextID(sessionID)
        AS
        (
            SELECT MAX(sessionID)+1
            FROM dbo.eventLog
        )
        INSERT INTO dbo.EventLog(sessionID)
        SELECT sessionID
        FROM NextID AS n

        SELECT TOP(1) @NextID = SessionID
        FROM dbo.EventLog
        WHERE logText IS NULL
        ORDER BY logText

        --for testing
        SELECT @NextID AS NextID

        /* Put in a wait state for testing only */
        WAITFOR DELAY '000:00:15' -- 15 sec delay for TESTING ONLY!

        --WHILE WAITING, GO TO SEPARATE WINDOW AND RUN AN INSERT--

        UPDATE dbo.eventLog
        SET logText = 'Ronald Reagan'
            ,created = GETDATE()
        WHERE sessionID = @NextID
    END

    SELECT * FROM dbo.eventLog

    WINDOW 2:

    /* Simulate a second simultaneous insert */
    /* while waiting for the first insert */
    DECLARE @NewID INT

    SELECT @NewID = MAX(sessionID)+1
    FROM dbo.eventLog

    ;WITH cteNewData (sessionID,logText,created)
    AS
    (
        SELECT @NewID,'Bill Clinton','2013-01-31'
    )
    INSERT INTO dbo.eventLog (sessionID,logText,created)
    SELECT nd.sessionID, nd.logText, nd.created
    FROM cteNewData AS nd

    SELECT * FROM dbo.eventLog

  • Somebody's probably going to shoot me for suggesting it, but (utilizing Steve's wonderful DDL and sample data) perhaps something like this:

    CREATE TABLE #eventLog
    (
        sessionID INT NOT NULL
        ,logID TIMESTAMP NOT NULL
        ,logText VARCHAR(1000) NULL
        ,created DATETIME NULL
        ,PRIMARY KEY CLUSTERED (sessionID, logID)
        ,UNIQUE (sessionID)
    )

    ;WITH cteSampleData (sessionID,logText,created)
    AS
    (
        SELECT 1001,'George Washington','2013-03-31' UNION ALL
        SELECT 1002,'John Adams','2013-02-28' UNION ALL
        SELECT 1003,'Thomas Jefferson','2013-02-15' UNION ALL
        SELECT 1004,'James Madison','2013-02-01' UNION ALL
        SELECT 1005,'James Monroe','2013-01-31' UNION ALL
        SELECT 1006,'John Q Adams','2013-01-22' UNION ALL
        SELECT 1007,'Andrew Jackson','2013-01-13'
    )
    INSERT INTO #eventLog (sessionID,logText,created)
    SELECT sessionID, logText, created
    FROM cteSampleData

    SELECT * FROM #eventLog;

    WITH LastSessionID AS
    (
        SELECT LastSessionID=MAX(sessionID) FROM #eventLog WITH(UPDLOCK)
    )
    INSERT INTO #eventLog (sessionID,logText,created)
    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) + LastSessionID
        ,logText,created
    FROM (
        -- Your newest inserts go here
        SELECT 'Babe Ruth', '2013-01-20'
        UNION ALL SELECT 'Ty Cobb', '2013-01-21'
    ) a (logText,created)
    CROSS APPLY LastSessionID b

    SELECT * FROM #eventLog
    GO

    DROP TABLE #eventLog

    Table name changed to a temp table to protect the innocent.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
    Since random numbers are too important to be left to chance, let's generate some!
    Learn to understand recursive CTEs by example.

  • Steven - BTW.

    If you do it the way you suggested, which is to break it into multiple SQL statements, you should consider 2 things:

    1. Wrap the SQL statements in a TRANSACTION.

    2. Add the UPDLOCK hint on the SELECT MAX (as I did in my example)

    Then in your 2-windows test case, you don't need to introduce the WAIT. You can simply execute the first few statements just prior to the INSERT, switch windows, and execute the other SQL (which will hold until you go back to the prior window and execute the final statements up to COMMIT TRANSACTION).

    UPDLOCK holds its lock on the table until the transaction completes. In my case, since it wasn't in a transaction, the lock is freed once the single statement completes.
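
    To make that concrete, the combination might look something like this (just a sketch against the dbo.eventLog table from my earlier post, not production code; logID is the TIMESTAMP column, so it fills itself in):

    BEGIN TRANSACTION

    DECLARE @NextID INT

    SELECT @NextID = MAX(sessionID) + 1
    FROM dbo.eventLog WITH (UPDLOCK) -- U lock now held until COMMIT

    -- in the 2-windows test, switch windows here; the other session blocks

    INSERT INTO dbo.eventLog (sessionID, logText, created)
    VALUES (@NextID, 'Ronald Reagan', GETDATE())

    COMMIT TRANSACTION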



  • dwain.c (7/12/2013)


    Steven - BTW.

    If you do it the way you suggested, which is to break it into multiple SQL statements, you should consider 2 things:

    1. Wrap the SQL statements in a TRANSACTION.

    2. Add the UPDLOCK hint on the SELECT MAX (as I did in my example)

    Then in your 2-windows test case, you don't need to introduce the WAIT. You can simply execute the first few statements just prior to the INSERT, switch windows, and execute the other SQL (which will hold until you go back to the prior window and execute the final statements up to COMMIT TRANSACTION).

    UPDLOCK holds its lock on the table until the transaction completes. In my case, since it wasn't in a transaction, the lock is freed once the single statement completes.

    Yeah, in an actual case I'd have added a TRAN block with UPDLOCK and I did that at first. But I wanted instant gratification watching the WINDOW 2 rows get inserted then flip back to WINDOW 1 and see the first row get updated after the delay. I know that it was a contrived simulation.

    If this represents a high-volume situation, I'm not sure what effect the UPDLOCK might have on concurrency. And I'm not sure my method would handle more than two simultaneous insert attempts, or what would happen if a duplicate PK still somehow caused an error. An early version had a TRY/CATCH block that would keep trying until a valid PK sessionID could be inserted, but it was late and I was really guessing at what the OP's actual requirement was. Seems to me that some sort of additional identity column would be a much easier and more reliable fix.

     

  • Thank you both for your responses. It seems the answer will involve explicit transactions and explicit locks. I was hoping to avoid that. I know there is a proper time to use explicit transactions, but I have also seen a lot of problems introduced (deadlocks and performance issues). Similarly with optimizer hints... if I have to resort to them, I suspect there is some other design problem.

    With locking hints, I've spent time studying the documentation and reading blogs, forums, etc., and I admit I lack confidence that I (and many others) understand completely what the effect of each hint is. Which again leads me to avoid them when I can.

    The great examples you provided use UPDLOCK. It's not clear to me this would solve the problem, because UPDLOCK doesn't prevent other transactions from reading the rows to calculate the next ID (which could result in two transactions trying to use the same ID). You may be right, but it seems to me you would need XLOCK with PAGLOCK to prevent other transactions from reading the rows until you can insert the desired row. But now it seems I'm really trying too hard.

    And see this for example of how documented behavior isn't always complete:

    http://support.microsoft.com/kb/324417

    I plan on re-writing the table to use an Identity column, and avoid all the complexity.
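
    Roughly what I have in mind (an untested sketch; if the per-session logID starting at 1 is still needed for display, it can be derived at read time with ROW_NUMBER instead of being maintained on write):

    create table eventLog (
        eventID int identity(1,1) not null primary key,
        sessionID int not null,
        logText varchar(1000) null,
        created datetime null
    )

    -- logID per session, starting at 1, computed on read
    select sessionID,
        row_number() over (partition by sessionID order by eventID) as logID,
        logText, created
    from eventLog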

    Thanks again, and please comment on the lock discussion if you have some clarification!

  • Update: I tried several different combinations of locks, including (updlock), (xlock, paglock), and (holdlock). The only lock that prevented a second transaction from potentially inserting a key violation was (tablock). Please let me know if anyone sees something I'm missing.

    ( BTW, how do you create those windows of code in the forum?)

    To create the table:

    CREATE TABLE [dbo].[EventLog](
        [sessionID] [int] NOT NULL,
        [logID] [int] NOT NULL,
        [logText] [varchar](50) NULL,
        [created] [datetime] NULL,
        CONSTRAINT [PK_EventLog] PRIMARY KEY CLUSTERED
        (
            [sessionID] ASC,
            [logID] ASC
        )
    )

    -- in window 1 -----------------------------------
    declare
        @nextLogID int,
        @sessionID int,
        @logText varchar(50)

    truncate table eventLog

    select @sessionID = 1, @logText = 'this is a note #1'

    begin tran

    select @nextLogID = isNull(max(logID),0)+1
    from eventLog with (xlock, tablock) -- modify the lock hints to try variations
    where sessionID = @sessionID

    waitfor delay '00:00:05'

    insert into eventLog (sessionID, logID, logText)
    values (@sessionID, @nextLogID, @logText)

    commit tran

    select * from eventLog

    -- in window 2, run while window 1 is still waiting --------------
    declare
        @nextLogID int,
        @sessionID int,
        @logText varchar(50)

    select @sessionID = 1, @logText = 'this is a note #2'

    begin tran

    select @nextLogID = isNull(max(logID),0)+1
    from eventLog with (xlock, tablock)
    where sessionID = @sessionID

    insert into eventLog (sessionID, logID, logText)
    values (@sessionID, @nextLogID, @logText)

    commit tran

    select * from eventLog

  • rice.tx (7/14/2013)


    Thank you both for your responses. It seems the answer will involve explicit transactions and explicit locks. I was hoping to avoid that. I know there is a proper time to use explicit transactions, but I have also seen a lot of problems introduced (deadlocks and performance issues). Similarly with optimizer hints... if I have to resort to them, I suspect there is some other design problem.

    ...

    The great examples you provided use UPDLOCK. It's not clear to me this would solve the problem, because UPDLOCK doesn't prevent other transactions from reading the rows to calculate the next ID (which could result in two transactions trying to use the same ID). You may be right, but it seems to me you would need XLOCK with PAGLOCK to prevent other transactions from reading the rows until you can insert the desired row. But now it seems I'm really trying too hard.

    ...

    If you want 100% certainty that there will not be a key violation, you must use an IDENTITY column (but presumably you know that and have some reason why you think you cannot).

    While there may be edge cases where the locks don't work, I can say my suggestion to use UPDLOCK was more than theoretical. We've used it quite successfully in a very similar case, and from what I can tell from the documentation (and from actually trying it with two SSMS sessions), it does exactly what it was designed to do. And the system that uses this is a relatively high-volume one.

    I still recommend combining the SELECT MAX and the INSERT into a single statement, since that avoids latency between queries which might contribute to deadlocks.
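
    Combined, it might look something like this (a sketch against your original eventLog table; the HOLDLOCK is my addition so that the key range stays locked even when the session has no rows yet, and you should treat this as a starting point rather than something I've tested at your volume):

    insert into eventLog (sessionID, logID, logText, created)
    select @sessionID, isNull(max(logID),0)+1, @logText, getdate()
    from eventLog with (updlock, holdlock)
    where sessionID = @sessionID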

    And the only reason I've seen where a TRANSACTION might cause unwarranted deadlocks is when you've got some inefficient query running within the TRANSACTION (in which case, try to improve that query), or when you're not following the best practice of minimizing the amount of work done within the TRANSACTION.

    For a somewhat contrived example, I've seen TRANSACTIONs that perform SELECTs within their scope that have nothing to do with the data being updated. Those should be removed from the TRANSACTION to minimize the overall processing time.

    Good luck!


