March 5, 2007 at 1:47 pm
I have a table that is being filled roughly every 10 seconds via a T-SQL stored procedure.
While this is going on, several small programs (likely around 10) will be accessing this table, each running a query of the form: "SELECT TOP [n] ... FROM ..."
The issue here is that any record selected from this table by one of these small programs cannot be selected by any of the other versions of that program that are running concurrently. Therefore, I need a way to DELETE these records from the table before they can be accessed again. My thought here is to perform these SELECT and DELETE queries within another T-SQL stored procedure, but I'm not sure if that will do the trick. As of now we're running SQL 2000 under standard isolation levels.
If anyone can reply with an idea or some info I would really appreciate it.
March 5, 2007 at 2:26 pm
This may need some work on the transaction part but here's the global idea :
begin transaction
UPDATE BT
SET spid = @@SPID
FROM dbo.BaseTable BT
INNER JOIN (SELECT TOP 10 [id] FROM dbo.BaseTable WHERE spid IS NULL) D
ON BT.[id] = D.[id]
COMMIT transaction
--process the records
Select * from dbo.baseTable where spid = @@spid
--after the process is over
DELETE from dbo.BaseTable where spid = @@spid
March 5, 2007 at 11:15 pm
Nicely done, Remi...
--Jeff Moden
Change is inevitable... Change for the better is not.
March 6, 2007 at 5:23 am
Thanx Jeff. The only part I'm not certain about is deadlocks. Can you confirm that it is safe on that end?
March 6, 2007 at 6:08 am
About the only thing I'd change is I'd remove the declared transaction.
--Jeff Moden
Change is inevitable... Change for the better is not.
March 6, 2007 at 7:16 am
I think you need to put an UPDLOCK hint in the derived table in order to guarantee that only one transaction can read the rows. A READPAST hint could also be added to increase concurrency.
UPDATE BT
SET spid = @@spid
FROM dbo.BaseTable BT
JOIN (
SELECT TOP 10 [id]
FROM dbo.BaseTable BT2 WITH (READPAST UPDLOCK)
WHERE spid IS NULL
) D
ON BT.[id] = D.[id]
As the only process seems to be a SELECT, I would do something, like the following, so there is only one commit:
DECLARE @t TABLE
(
[id] int NOT NULL PRIMARY KEY
)
BEGIN TRAN
INSERT INTO @t
SELECT TOP 10 [id]
FROM dbo.BaseTable WITH (READPAST UPDLOCK)
SELECT *
FROM dbo.BaseTable
WHERE EXISTS (
SELECT *
FROM @t T
WHERE dbo.BaseTable.[id] = T.[id]
)
DELETE dbo.BaseTable
WHERE EXISTS (
SELECT *
FROM @t T
WHERE dbo.BaseTable.[id] = T.[id]
)
COMMIT
March 6, 2007 at 9:13 am
As the only process seems to be a SELECT, I would do something, like the following, so there is only one commit:
I would not assume that if I were you. My code was just a sample to demonstrate the global idea. I have no clue what type of processing needs to be done with those records after they are tagged.
March 6, 2007 at 1:38 pm
Thanks so much for the help so far!
For a bit of clarification: the small programs that will be accessing this table will each create text files from the data they pull, for another program to use. Once a record has been pulled for this use (i.e. once a "tagged" record has been pulled), it must never be pulled again.
Also, a tagged record's continued existence in the table serves no purpose other than to prevent new records from being added to the table, which is why we need to DELETE them after they are pulled.
With that in mind, is Ken's solution viable?
March 6, 2007 at 4:44 pm
I prefer Remi's solution... shorter transaction... fewer actions within the transaction... maybe fewer deadlocks.
--Jeff Moden
Change is inevitable... Change for the better is not.
March 6, 2007 at 5:31 pm
I'll take Jeff's word for it. I don't have enough experience in high-transaction environments to comment on deadlocks.
The only change I would make is add a second sp.
sp 1 tags and fetches 10 records.
sp 2 deletes the records.
Just to be 100% safe, I think I would use an output parameter in sp1, which would be set to the spid.
Then add an input param to sp2 which would be set to the output parameter value of sp 1.
Then in the program :
while (whatever condition you need)
--fetch records
call sp1 @OutputParam
--@OutputParam is now set to the current spid
process records
--delete record
call sp2 @OutputParam --still set to the "old" spid, this will be used to delete the tagged records.
wend
That way you can always recover if for some reason you get disconnected from the server between the 2 calls. But maybe I'm just too paranoid.
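The two-procedure split described above could be sketched roughly as follows. This is only an illustration of the idea from the thread: the procedure names (TagAndFetch, DeleteTagged) are invented here, and the [id] and spid columns are assumed from the earlier posts.

```sql
-- Sketch only: procedure names are hypothetical; [id]/spid columns assumed from the thread.
CREATE PROCEDURE dbo.TagAndFetch
    @BatchID int OUTPUT
AS
    SET @BatchID = @@SPID

    -- Tag up to 10 untagged rows; READPAST skips rows locked by other callers
    UPDATE BT
    SET spid = @BatchID
    FROM dbo.BaseTable BT
    INNER JOIN (
        SELECT TOP 10 [id]
        FROM dbo.BaseTable WITH (READPAST UPDLOCK)
        WHERE spid IS NULL
    ) D ON BT.[id] = D.[id]

    -- Return the tagged rows to the caller
    SELECT * FROM dbo.BaseTable WHERE spid = @BatchID
GO

CREATE PROCEDURE dbo.DeleteTagged
    @BatchID int
AS
    -- Called after the caller has finished processing the batch
    DELETE FROM dbo.BaseTable WHERE spid = @BatchID
GO
```

The caller would invoke TagAndFetch, hold the output parameter, process the rows, then pass the same value to DeleteTagged, matching the while/wend loop above.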
March 7, 2007 at 4:58 am
In this situation, Remi's outline solution is best.
You should, however, use an UPDLOCK hint on the SELECT in the derived table. Ensuring that only one of the processes can read the rows with an UPDLOCK means deadlocks are avoided. (See the UPDLOCK section in the following: http://www.mssqlcity.com/Articles/General/sql2000_locking.htm ) If acceptable, a READPAST hint will help with concurrency.
Also, if you are using pooled connections in .NET, you may want to generate your own batchID rather than relying on @@SPID.
March 7, 2007 at 7:16 am
That's also why I threw the output parameters in there. If for any bizarre reason you get disconnected, you can still continue the batch. Then again, the same spid could conceivably be reused by two consecutive, but distinct, connections.
So to further bulletproof this, you could switch the spid column to a uniqueidentifier. Or you could add a preliminary check to see whether the spid already exists in the table before selecting any new rows. The latter option would cost you one more query each run, while the former would be slightly more costly on I/O, because the tag column would be a 16-byte uniqueidentifier instead of a 4-byte int, and it would need to be indexed.
I think I'd go with the uniqueidentifier just to make sure I have no surprise ever.
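With a uniqueidentifier tag column, the tagging step might look like this. A sketch only: the BatchID column name is an assumption, and the rest of the table layout is taken from the earlier examples.

```sql
-- Sketch: BatchID is a hypothetical uniqueidentifier column replacing spid
DECLARE @BatchID uniqueidentifier
SET @BatchID = NEWID()   -- unique per run, so no reused-spid surprises

UPDATE BT
SET BatchID = @BatchID
FROM dbo.BaseTable BT
INNER JOIN (
    SELECT TOP 10 [id]
    FROM dbo.BaseTable WITH (READPAST UPDLOCK)
    WHERE BatchID IS NULL
) D ON BT.[id] = D.[id]

SELECT * FROM dbo.BaseTable WHERE BatchID = @BatchID
```

Since NEWID() can never collide with a value from an earlier run, the preliminary existence check becomes unnecessary.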
March 7, 2007 at 7:36 am
A BatchID could also be generated from a dummy IDENTITY table. Something like.
CREATE TABLE dbo.BatchIDs
(
BatchId int IDENTITY NOT NULL
)
GO
DECLARE @BatchID int
BEGIN TRAN
INSERT INTO dbo.BatchIDs DEFAULT VALUES
ROLLBACK
SELECT @BatchID = SCOPE_IDENTITY()
-- SELECT @BatchID
March 7, 2007 at 7:47 am
Thanks for this interesting concept. I'm not sure I'd use it over the uniqueidentifier.
I like to have as few transactions as possible, but I'll be sure to keep this in mind in the future.
March 7, 2007 at 11:46 am
To Remi: Is there any reason for putting the DELETE in a separate sp? It seems that if I were to have the UPDATE, SELECT and DELETE queries all within the same sp, I wouldn't have to worry about a potential change in spid when I go to DELETE.
To Ken: The SCOPE_IDENTITY() function seems like it would be great for this application, but the documentation I saw for it indicates that it's for SQL 2005 (I'm still running on 2000 here).