How To: Avoid Deadlocks

Question

How To: Avoid Deadlocks

bigdave

SSC Eights!

Points: 853
More actions
November 3, 2002 at 5:23 pm

#80056

I have a complex sp that is responsible for selecting, locking and returning the next call in a call centre. It is tricky because I have to ensure that the same record is not returned to multiple users.
It is quite a long sp that attempts to retrieve a record using a series of select statements, one after the other, until a suitable record is found. It then attempts an update on the record (inserting the sessionid of the user to essentially lock the record).
My problem is that I have deadlocks occuring and cant figure out a way to avoid them. They are getting very serious at some sites.
I have tried using updlock in a select, but that doesnt seem to help.
I know that I could try to think of an alternative approach for selecting the next call (such as building and mantaining tables for my pools of calls using triggers) . . but this would pose other problems (besides the development time!).
I have tried various ways to use transactions inside the sp - but they generally result in the same call going to multiple users.
Any suggestions would be hugely appreciated. Here is the sp . . . .
create Procedure usr_GetNextCall @SessionID int = 0
AS
declare @MaxCalls int
declare @AnswerID int
declare @user-id int
declare @i int
set nocount on
set ansi_warnings off
-- Only return 1 row
set RowCount 1
-- Store User ID
select @user-id = (Select UserID from CurrentUsers Where SessionID = @SessionID)
--
-- Use a transaction
--Begin Tran
SET TRANSACTION ISOLATION LEVEL READ COMMITTED
--
-- Set A Default AnswerID
select @AnswerID = 0
select @i = 0
--
-- 1. Do Operator Specific Call-Backs
Select @AnswerID =
(
Select Top 1 AnswerID
from Answers_Eligible
--With(updlock)
Where SaveUserID = @user-id
And (not CallBackStart is null)
And (not CallBackEnd is null)
And CallBackStart < GetDate()
And CallBackEnd > GetDate()
And (WhyIncompleteActionID = 3 or WhyIncompleteActionID = 5)
Order By CallBackStart
)
-- Abort if row found
If @AnswerID>0 Goto LockRow
--
-- 2. Do Operator Generic Call-Backs
Select @AnswerID =
(
Select Top 1 AnswerID
from Answers_Eligible
--With(updlock)
Where (not CallBackStart is null)
And (not CallBackEnd is null)
And CallBackStart < GetDate()
And CallBackEnd > GetDate()
And (WhyIncompleteActionID = 4)
Order By CallBackStart
)
-- Abort if row found
If @AnswerID>0 Goto LockRow
--
-- 3. Do Operator Specific Then Generic Call-Backs
Select @AnswerID =
(
Select Top 1 AnswerID
from Answers_Eligible
--With(updlock)
Where (not CallBackStart is null)
And (not CallBackEnd is null)
And GetDate() > DateAdd(mi, CallBackExpiry, CallBackStart)
And CallBackEnd > GetDate()
And (WhyIncompleteActionID = 5)
Order By CallBackStart
)
-- Abort if row found
If @AnswerID>0 Goto LockRow
--
-- 4. Do Forced Records (Top of Pool)
Select @AnswerID =
(
Select Top 1 AnswerID
from Answers_Eligible
--With(updlock)
Where (WhyIncompleteActionID = 6)
Order By Priority, SaveDateTime
)
-- Abort if row found
If @AnswerID>0 Goto LockRow
--
-- Start any retries here
RetryStartHere:
-- count loop iterations
Select @i = @i + 1
--
-- 5. Do Remainder of Pool (including records forced into the bottom of the pool)
Select @AnswerID =
(
Select Top 1 AnswerID
from Answers_Eligible
--With(updlock)
WHERE (WhyIncompleteActionID = 2 OR WhyIncompleteActionID = 8 OR WhyIncompleteActionID = 7 OR WhyIncompleteActionID IS NULL)
AND (MaxCalls > CallAttempt)
AND (DATEADD(mi, DelayTime, SaveDateTime) < GETDATE())
AND (CallBackStart IS NULL)
AND (CallBackEnd IS NULL)
OR (WhyIncompleteActionID = 2 OR WhyIncompleteActionID = 8 OR WhyIncompleteActionID = 7 OR WhyIncompleteActionID IS NULL)
AND (MaxCalls > CallAttempt)
AND (CallBackStart IS NULL)
AND (CallBackEnd IS NULL)
AND (SaveDateTime IS NULL)
Order By Priority, DateAdd(mi, DelayTime, SaveDateTime), AnswerID
)
-- Abort if row found
If @AnswerID>0
begin
Goto LockRow
end
Else
begin
-- must be no records available
goto NoRecords
end
--
-- Lock The Record and Return It
LockRow:
-- Lock the record
--BEGIN TRAN
update Answers set InUse = @SessionID Where AnswerID = @AnswerID And InUse = 0
--COMMIT TRAN
-- check for no update
if @@ROWCOUNT = 0
begin
-- log looping
--BEGIN TRAN
Insert into LoopLog (SessionID, AnswerID, Counter, HitDateTime)
Select @SessionID, @AnswerID, @i, GetDate()
--COMMIT TRAN
-- go to start of loop
goto RetryStartHere
end
-- Return the result
NoRecords:
select @AnswerID
Done:
--Commit Tran
--

Viewing 13 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply

Andy Warren SSC Guru Points: 119912 More actions · Answer 1

One way would be increase the transaction level to serialized - though that will slow things down. Another is to just trap for deadlocks and have the app issue a retry.

Andy

http://www.sqlservercentral.com/columnists/awarren/

Andy
Connect with me on LinkedIn

dj_meier SSCrazy Points: 2732 More actions · Answer 2

I have had success using the TABLOCKX locking hint. Although it goes against what seems like common logic, it actually increased performance by about 10x.

I especially notice improvements when subqueries using the same table are involved. Here is a code snippet, although not the exact same as yours, you can get the idea:

UPDATE Queue

SET Status = 'H',

Start_Time = GETDATE(),

User_Login = @Uid

FROM Queue WITH (TABLOCKX INDEX=PK_Queue)

WHERE Client_Key = (SELECT TOP 1 Client_Key

FROM Queue WITH (TABLOCKX INDEX=queue1)

WHERE Project_ID = @Project

AND Status = 'Q'

AND ISNULL(Conf_Lev,0) < @Threshold_A

AND ISNULL(Conf_Lev,0) > @Threshold_B)

Another technique you may consider is trying to UPDATE at each condition instead of selecting the ID. Then check the @@rowcount after each UPDATE to see if you got something. By using this method, you lock the row (or table if using TABLOCKX) as you check the conditions.

-Dan

Edited by - dj_meier on 11/05/2002 10:17:34 AM

-Dan

bigdave SSC Eights! Points: 853 More actions · Answer 3

Thanks for your suggestions Dan. Andy's suggestions are not really constructive... slowing down the sp is not an option (we already have timeouts occurring), and of course the app already resubmites on deadlock, but this is not solving the issue.

I had tried to issue an update immediately (as per your recommendation) . . but found that multiple users received the same record (oops!). I guess I need to lock the row in question when updating. I thought that the default isolation level of read committed would lock the row and prevent multiple updates occurring at once (I am begining to think that I will never uderstand locking in SQL Server??).

I will try to issue an update with a row locking hint and see how goes.

Thanks again.

Dave

dj_meier SSCrazy Points: 2732 More actions · Answer 4

The field that is being updated needs to be checked in the WHERE clause. For example, if you are updating the InUse field, then it must say in the UPDATE, "WHERE InUse = 0", or something like that. (I'm assuming AnswerID is unique, and the SessionID value stays set after the user is done w/ the record)

What about something like this:

-- First Condition...

update Answers

set InUse = @SessionID

Where AnswerID =

(

Select Top 1 AnswerID

from Answers_Eligible WITH (TABLOCKX)

Where SaveUserID = @user-id

And (not CallBackStart is null)

And (not CallBackEnd is null)

And CallBackStart < GetDate()

And CallBackEnd > GetDate()

And (WhyIncompleteActionID = 3 or WhyIncompleteActionID = 5)

Order By CallBackStart

)

And InUse = 0

-- Abort if row found

IF @@ROWCOUNT = 0

BEGIN

-- Second Condition

-- (Your code here)...

IF @@ROWCOUNT = 0

BEGIN

-- Third condition here

END

-Dan

bigdave SSC Eights! Points: 853 More actions · Answer 5

Thanks Dan. I have not had enough time to implement and test this suggestion . . but I will soon.

I did not realise that to lock a col it should be in the 'where' clause of the update. Can anyone else please confirm this behaviour?

Scorpion_66 SSCertifiable Points: 7891 More actions · Answer 6

It does not have to be in the where clause to lock the column. That was not the reason for suggesting that course, I don't believe.

I believe it was suggested to prevent more than a single select from grabbing the data, as one that was already being used would have INUSE <> 0. And its an excellent suggestion, in my opinion......for what thats worth.....

guarddata SSCertifiable Points: 6583 More actions · Answer 7

We have found it more convenient to separate the SELECT from the UPDATE in order to know if the failure was due to contention or end of queue. Also - ROWLOCK is probably sufficient. Something like:

SELECT @GoodRow = 0, @AnswerID = NULL

WHILE @GoodRow = 0

BEGIN

SELECT @AnswerID = Top 1 AnswerID

FROM Answers_Eligible WITH(ROWLOCK)

WHERE <Condition>

IF @AnswerID IS NULL

SELECT @GoodRow = 2

ELSE

BEGIN

UPDATE Answers_Eligible SET InUse = @SessionID

WHERE AnswerID = @AnswerID AND InUse = 0

IF @@ROWCOUNT = 1

SELECT @GoodRow = 1

END

IF @GoodRow

ELSE

Scorpion_66 SSCertifiable Points: 7891 More actions · Answer 8

With the select and the update in seperate statements, I believe its possible for two occurrances of this to grab the same row, and the last one to update the record would win. If its done this way, I would think that you would need to wrap the select and the update in a transaction to prevent another occurrance of it from grabbing the same row prior to the update. I think that it would be hard to get two occurrances to get the same row that way, but it would be possible. The read committed transaction level would not prevent two users from reading the same row until the update was initialized. A transaction would lock the row read until the batch of statements finished, where it is obvious that the rowlock does not hold the row locked in between the select and update as the row could not be updated if it was.

If it is to be in one statement, the update, and the identity passed back, you do not have that issue. If it to be in two statements, wrap it in a transaction to prevent this behavior.

jensk2 SSC Enthusiast Points: 159 More actions · Answer 9

quote:
I have a complex sp that is responsible for selecting, locking and returning the next call in a call centre. It is tricky because I have to ensure that the same record is not returned to multiple users.
It is quite a long sp that attempts to retrieve a record using a series of select statements, one after the other, until a suitable record is found. It then attempts an update on the record (inserting the sessionid of the user to essentially lock the record).
My problem is that I have deadlocks occuring and cant figure out a way to avoid them. They are getting very serious at some sites.
I have tried using updlock in a select, but that doesnt seem to help.
I know that I could try to think of an alternative approach for selecting the next call (such as building and mantaining tables for my pools of calls using triggers) . . but this would pose other problems (besides the development time!).
I have tried various ways to use transactions inside the sp - but they generally result in the same call going to multiple users.
Any suggestions would be hugely appreciated. Here is the sp . . . .
create Procedure usr_GetNextCall @SessionID int = 0
AS
declare @MaxCalls int
declare @AnswerID int
declare @user-id int
declare @i int
set nocount on
set ansi_warnings off
-- Only return 1 row
set RowCount 1
-- Store User ID
select @user-id = (Select UserID from CurrentUsers Where SessionID = @SessionID)
--
-- Use a transaction
--Begin Tran
SET TRANSACTION ISOLATION LEVEL READ COMMITTED
--
-- Set A Default AnswerID
select @AnswerID = 0
select @i = 0
--
-- 1. Do Operator Specific Call-Backs
Select @AnswerID =
(
Select Top 1 AnswerID
from Answers_Eligible
--With(updlock)
Where SaveUserID = @user-id
And (not CallBackStart is null)
And (not CallBackEnd is null)
And CallBackStart < GetDate()
And CallBackEnd > GetDate()
And (WhyIncompleteActionID = 3 or WhyIncompleteActionID = 5)
Order By CallBackStart
)
-- Abort if row found
If @AnswerID>0 Goto LockRow
--
-- 2. Do Operator Generic Call-Backs
Select @AnswerID =
(
Select Top 1 AnswerID
from Answers_Eligible
--With(updlock)
Where (not CallBackStart is null)
And (not CallBackEnd is null)
And CallBackStart < GetDate()
And CallBackEnd > GetDate()
And (WhyIncompleteActionID = 4)
Order By CallBackStart
)
-- Abort if row found
If @AnswerID>0 Goto LockRow
--
-- 3. Do Operator Specific Then Generic Call-Backs
Select @AnswerID =
(
Select Top 1 AnswerID
from Answers_Eligible
--With(updlock)
Where (not CallBackStart is null)
And (not CallBackEnd is null)
And GetDate() > DateAdd(mi, CallBackExpiry, CallBackStart)
And CallBackEnd > GetDate()
And (WhyIncompleteActionID = 5)
Order By CallBackStart
)
-- Abort if row found
If @AnswerID>0 Goto LockRow
--
-- 4. Do Forced Records (Top of Pool)
Select @AnswerID =
(
Select Top 1 AnswerID
from Answers_Eligible
--With(updlock)
Where (WhyIncompleteActionID = 6)
Order By Priority, SaveDateTime
)
-- Abort if row found
If @AnswerID>0 Goto LockRow
--
-- Start any retries here
RetryStartHere:
-- count loop iterations
Select @i = @i + 1
--
-- 5. Do Remainder of Pool (including records forced into the bottom of the pool)
Select @AnswerID =
(
Select Top 1 AnswerID
from Answers_Eligible
--With(updlock)
WHERE (WhyIncompleteActionID = 2 OR WhyIncompleteActionID = 8 OR WhyIncompleteActionID = 7 OR WhyIncompleteActionID IS NULL)
AND (MaxCalls > CallAttempt)
AND (DATEADD(mi, DelayTime, SaveDateTime) < GETDATE())
AND (CallBackStart IS NULL)
AND (CallBackEnd IS NULL)
OR (WhyIncompleteActionID = 2 OR WhyIncompleteActionID = 8 OR WhyIncompleteActionID = 7 OR WhyIncompleteActionID IS NULL)
AND (MaxCalls > CallAttempt)
AND (CallBackStart IS NULL)
AND (CallBackEnd IS NULL)
AND (SaveDateTime IS NULL)
Order By Priority, DateAdd(mi, DelayTime, SaveDateTime), AnswerID
)
-- Abort if row found
If @AnswerID>0
begin
Goto LockRow
end
Else
begin
-- must be no records available
goto NoRecords
end
--
-- Lock The Record and Return It
LockRow:
-- Lock the record
--BEGIN TRAN
update Answers set InUse = @SessionID Where AnswerID = @AnswerID And InUse = 0
--COMMIT TRAN
-- check for no update
if @@ROWCOUNT = 0
begin
-- log looping
--BEGIN TRAN
Insert into LoopLog (SessionID, AnswerID, Counter, HitDateTime)
Select @SessionID, @AnswerID, @i, GetDate()
--COMMIT TRAN
-- go to start of loop
goto RetryStartHere
end
-- Return the result
NoRecords:
select @AnswerID
Done:
--Commit Tran
--

jensk2 SSC Enthusiast Points: 159 More actions · Answer 10

I have a complex sp that is responsible for selecting, locking and returning the next call in a call centre. It is tricky because I have to ensure that the same record is not returned to multiple users.

I would go for updating the row to process immediately, instead of first selecting, then updating, which would actually require BEGIN TRAN + UPDLOC + COMMIT TRAN to ensure that multiple users do not get the same row.

If you replace all your selects with stuuf like:

declare @answ int

update callbacks set

@answ = answid,

inuse = 1

where answid in

(select min(answid) from

callbacks

where inuse = 0

-- and additional criterias

-- and your additional criterias

)

print @answ

go

or:

declare @answ int

update callbacks set

@answ = answid,

inuse = 1

where answid in

(select top 1 answid from

callbacks

where inuse = 0

-- and additional criterias

-- and your additional criterias

order by answid

)

print @answ

go

Then you should get the answid with natural serialization and no need for a transaction.

By the way; are you aware that select 1 and 2 could be collapsed to one?

regrads

jensk

(Here is some TSQL for proofing:

create table callbacks (answid int, inuse int)

go

create clustered index aidx on callbacks( answid )

go

insert into callbacks values( 1, 0 )

insert into callbacks values( 2, 0 )

insert into callbacks values( 3, 1 )

insert into callbacks values( 4, 0 )

insert into callbacks values( 5, 1 )

insert into callbacks values( 6, 0 )

go

declare @answ int

update callbacks set

@answ = answid,

inuse = 1

where answid in

(select min(answid) from

callbacks

where inuse = 0

-- and additional criterias

-- and your additional criterias

)

print @answ

go

declare @answ int

update callbacks set

@answ = answid,

inuse = 1

where answid in

(select top 1 answid from

callbacks

where inuse = 0

-- and additional criterias

-- and your additional criterias

order by answid

)

print @answ

go

drop table callbacks

go

gibsonrobert SSC Veteran Points: 237 More actions · Answer 11

Indexing may help, it did for us. By decreasing the total time records are locked, you decrease the chance of a deadlock. Try using the EM index wizard and see if that doesn't help.

Scorpion_66 SSCertifiable Points: 7891 More actions · Answer 12

the sql submitted by jensk2 would solve the issue, and I believe would be the best performer.

Indexes for the appropriate columns would of course, be great, but wouldn't actually solve the issue, would only relieve some of the symptoms.