March 5, 2007 at 1:47 pm
I have a table that is being filled roughly every 10 seconds via a T-SQL stored procedure.
While this is going on, several small programs (likely around 10) will be accessing this table, each running a query of the form: "SELECT TOP [n] ... FROM ..."
The issue here is that any record selected from this table by one of these small programs cannot be selected by any of the other versions of that program that are running concurrently. Therefore, I need a way to DELETE these records from the table before they can be accessed again. My thought here is to perform these SELECT and DELETE queries within another T-SQL stored procedure, but I'm not sure if that will do the trick. As of now we're running SQL 2000 under standard isolation levels.
If anyone can reply with an idea or some info I would really appreciate it.
March 5, 2007 at 2:26 pm
This may need some work on the transaction part but here's the global idea :
begin transaction
UPDATE BT
SET spid = @@SPID
FROM dbo.BaseTable BT
INNER JOIN (SELECT TOP 10 [id] FROM dbo.BaseTable WHERE spid IS NULL) D
ON BT.[id] = D.[id]
COMMIT transaction
--process the records
Select * from dbo.baseTable where spid = @@spid
--after the process is over
DELETE from dbo.BaseTable where spid = @@spid
March 5, 2007 at 11:15 pm
Nicely done, Remi...
--Jeff Moden
Change is inevitable... Change for the better is not.
March 6, 2007 at 5:23 am
Thanx Jeff. The only part I'm not certain about is deadlocks. Can you confirm that it is safe on that end?
March 6, 2007 at 6:08 am
About the only thing I'd change is I'd remove the declared transaction.
--Jeff Moden
Change is inevitable... Change for the better is not.
March 6, 2007 at 7:16 am
I think you need to put an UPDLOCK hint in the derived table in order to guarantee that only one transaction can read the rows. A READPAST hint could also be added to increase concurrency.
UPDATE BT
SET spid = @@spid
FROM dbo.BaseTable BT
JOIN (
SELECT TOP 10 [id]
FROM dbo.BaseTable BT2 WITH (READPAST UPDLOCK)
WHERE spid IS NULL
) D
ON BT.[id] = D.[id]
As the only process seems to be a SELECT, I would do something, like the following, so there is only one commit:
DECLARE @t TABLE
(
[id] int NOT NULL PRIMARY KEY
)
BEGIN TRAN
INSERT INTO @t
SELECT TOP 10 [id]
FROM dbo.BaseTable WITH (READPAST UPDLOCK)
SELECT *
FROM dbo.BaseTable
WHERE EXISTS (
SELECT *
FROM @t T
WHERE dbo.BaseTable.[id] = T.[id]
)
DELETE dbo.BaseTable
WHERE EXISTS (
SELECT *
FROM @t T
WHERE dbo.BaseTable.[id] = T.[id]
)
COMMIT
March 6, 2007 at 9:13 am
As the only process seems to be a SELECT, I would do something, like the following, so there is only one commit:
I would not assume that if I were you. My code was just a sample to demonstrate the global idea. I have no clue what type of processing needs to be done with those records after they are tagged.
March 6, 2007 at 1:38 pm
Thanks so much for the help so far!
For a bit of clarification: the small programs that will be accessing this table will each create text files from the data they pull, for another program to use. Once a record has been pulled for this use (i.e. once a "tagged" record has been pulled), it must never be pulled again.
Also, a tagged record's continued existence in the table serves no purpose other than to prevent new records from being added to the table, which is why we need to DELETE them after they are pulled.
With that in mind, is Ken's solution viable?
March 6, 2007 at 4:44 pm
I prefer Remi's solution... shorter transaction... fewer actions within the transaction... maybe fewer deadlocks.
--Jeff Moden
Change is inevitable... Change for the better is not.
March 6, 2007 at 5:31 pm
I'll take Jeff's word for it. I don't have enough experience in high-transaction environments to comment on deadlocks.
The only change I would make is add a second sp.
sp 1 tags and fetches 10 records.
sp 2 deletes the records.
Just to be 100% safe, I think I would use an output parameter in sp1, which would be set to the spid.
Then add an input param to sp2 which would be set to the output parameter value of sp 1.
Then in the program :
while (whatever condition you need)
--fetch records
call sp1 @OutputParam
--@OutputParam is now set to the current spid
process records
--delete record
call sp2 @OutputParam --still set to the "old" spid, this will be used to delete the tagged records.
wend
That way you can always recover if for some reason you get disconnected from the server between the 2 calls. But maybe I'm just too paranoid.
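The two-procedure split described above could be sketched roughly as follows. This is only an illustration of the idea from the thread: the procedure names (TagAndFetch, DeleteTagged) are invented here, and the [id] and spid columns are assumed from the earlier posts.

```sql
-- Sketch only: procedure names are hypothetical; [id]/spid columns assumed from the thread.
CREATE PROCEDURE dbo.TagAndFetch
    @BatchID int OUTPUT
AS
    SET @BatchID = @@SPID

    -- Tag up to 10 untagged rows; READPAST skips rows locked by other callers
    UPDATE BT
    SET spid = @BatchID
    FROM dbo.BaseTable BT
    INNER JOIN (
        SELECT TOP 10 [id]
        FROM dbo.BaseTable WITH (READPAST UPDLOCK)
        WHERE spid IS NULL
    ) D ON BT.[id] = D.[id]

    -- Return the tagged rows to the caller
    SELECT * FROM dbo.BaseTable WHERE spid = @BatchID
GO

CREATE PROCEDURE dbo.DeleteTagged
    @BatchID int
AS
    -- Called after the caller has finished processing the batch
    DELETE FROM dbo.BaseTable WHERE spid = @BatchID
GO
```

The caller would invoke TagAndFetch, hold the output parameter, process the rows, then pass the same value to DeleteTagged, matching the while/wend loop above.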
March 7, 2007 at 4:58 am
In this situation, Remi's outline solution is best.
You should, however, use an UPDLOCK hint on the SELECT in the derived table. Ensuring that only one of the processes can read the rows with an UPDLOCK means deadlocks are avoided. (See the UPDLOCK section in the following: http://www.mssqlcity.com/Articles/General/sql2000_locking.htm ) If acceptable, a READPAST hint will help with concurrency.
Also, if you are using pooled connections in .NET, you may want to generate your own batchID rather than relying on @@SPID.
March 7, 2007 at 7:16 am
That's also why I threw the output parameters in there. If for any bizarre reason you get disconnected, you can still continue the batch. Then again, the same spid could conceivably be reused by two consecutive, but distinct, connections.
So to further bulletproof this, you could switch the spid column to a uniqueidentifier. Or you could add a preliminary check to see whether the spid already exists in the table before selecting any new rows. The latter option would cost you one more query each run, while the former would be slightly more costly on I/O, because the tag column would be a 16-byte uniqueidentifier instead of a 4-byte int, and it would need to be indexed.
I think I'd go with the uniqueidentifier just to make sure I have no surprise ever.
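With a uniqueidentifier tag column, the tagging step might look like this. A sketch only: the BatchID column name is an assumption, and the rest of the table layout is taken from the earlier examples.

```sql
-- Sketch: BatchID is a hypothetical uniqueidentifier column replacing spid
DECLARE @BatchID uniqueidentifier
SET @BatchID = NEWID()   -- unique per run, so no reused-spid surprises

UPDATE BT
SET BatchID = @BatchID
FROM dbo.BaseTable BT
INNER JOIN (
    SELECT TOP 10 [id]
    FROM dbo.BaseTable WITH (READPAST UPDLOCK)
    WHERE BatchID IS NULL
) D ON BT.[id] = D.[id]

SELECT * FROM dbo.BaseTable WHERE BatchID = @BatchID
```

Since NEWID() can never collide with a value from an earlier run, the preliminary existence check becomes unnecessary.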
March 7, 2007 at 7:36 am
A BatchID could also be generated from a dummy IDENTITY table. Something like.
CREATE TABLE dbo.BatchIDs
(
BatchId int IDENTITY NOT NULL
)
GO
DECLARE @BatchID int
BEGIN TRAN
INSERT INTO dbo.BatchIDs DEFAULT VALUES
ROLLBACK
SELECT @BatchID = SCOPE_IDENTITY()
-- SELECT @BatchID
March 7, 2007 at 7:47 am
Thanks for this interesting concept. I'm not sure I'd use it over the uniqueidentifier.
I like to have as few transactions as possible, but I'll be sure to keep this in mind in the future.
March 7, 2007 at 11:46 am
To Remi: Is there any reason for putting the DELETE in a separate sp? It seems that if I were to have the UPDATE, SELECT and DELETE queries all within the same sp, I wouldn't have to worry about a potential change in spid when I go to DELETE.
To Ken: The SCOPE_IDENTITY() function seems like it would be great for this application, but the documentation I saw for it indicates that it's for SQL 2005 (I'm still running on 2000 here).