How To: Avoid Deadlocks

  • I have a complex sp that is responsible for selecting, locking and returning the next call in a call centre. It is tricky because I have to ensure that the same record is not returned to multiple users.

    It is quite a long sp that attempts to retrieve a record using a series of select statements, one after the other, until a suitable record is found. It then attempts an update on the record (inserting the sessionid of the user to essentially lock the record).

    My problem is that I have deadlocks occuring and cant figure out a way to avoid them. They are getting very serious at some sites.

    I have tried using updlock in a select, but that doesnt seem to help.

    I know that I could try to think of an alternative approach for selecting the next call (such as building and mantaining tables for my pools of calls using triggers) . . but this would pose other problems (besides the development time!).

    I have tried various ways to use transactions inside the sp - but they generally result in the same call going to multiple users.

    Any suggestions would be hugely appreciated. Here is the sp . . . .

    create Procedure usr_GetNextCall @SessionID int = 0

    AS

    declare @MaxCalls int

    declare @AnswerID int

    declare @user-id int

    declare @i int

    set nocount on

    set ansi_warnings off

    -- Only return 1 row

    set RowCount 1

    -- Store User ID

    select @user-id = (Select UserID from CurrentUsers Where SessionID = @SessionID)

    --

    -- Use a transaction

    --Begin Tran

    SET TRANSACTION ISOLATION LEVEL READ COMMITTED

    --

    -- Set A Default AnswerID

    select @AnswerID = 0

    select @i = 0

    --

    -- 1. Do Operator Specific Call-Backs

    Select @AnswerID =

    (

    Select Top 1 AnswerID

    from Answers_Eligible

    --With(updlock)

    Where SaveUserID = @user-id

    And (not CallBackStart is null)

    And (not CallBackEnd is null)

    And CallBackStart < GetDate()

    And CallBackEnd > GetDate()

    And (WhyIncompleteActionID = 3 or WhyIncompleteActionID = 5)

    Order By CallBackStart

    )

    -- Abort if row found

    If @AnswerID>0 Goto LockRow

    --

    -- 2. Do Operator Generic Call-Backs

    Select @AnswerID =

    (

    Select Top 1 AnswerID

    from Answers_Eligible

    --With(updlock)

    Where (not CallBackStart is null)

    And (not CallBackEnd is null)

    And CallBackStart < GetDate()

    And CallBackEnd > GetDate()

    And (WhyIncompleteActionID = 4)

    Order By CallBackStart

    )

    -- Abort if row found

    If @AnswerID>0 Goto LockRow

    --

    -- 3. Do Operator Specific Then Generic Call-Backs

    Select @AnswerID =

    (

    Select Top 1 AnswerID

    from Answers_Eligible

    --With(updlock)

    Where (not CallBackStart is null)

    And (not CallBackEnd is null)

    And GetDate() > DateAdd(mi, CallBackExpiry, CallBackStart)

    And CallBackEnd > GetDate()

    And (WhyIncompleteActionID = 5)

    Order By CallBackStart

    )

    -- Abort if row found

    If @AnswerID>0 Goto LockRow

    --

    -- 4. Do Forced Records (Top of Pool)

    Select @AnswerID =

    (

    Select Top 1 AnswerID

    from Answers_Eligible

    --With(updlock)

    Where (WhyIncompleteActionID = 6)

    Order By Priority, SaveDateTime

    )

    -- Abort if row found

    If @AnswerID>0 Goto LockRow

    --

    -- Start any retries here

    RetryStartHere:

    -- count loop iterations

    Select @i = @i + 1

    --

    -- 5. Do Remainder of Pool (including records forced into the bottom of the pool)

    Select @AnswerID =

    (

    Select Top 1 AnswerID

    from Answers_Eligible

    --With(updlock)

    WHERE (WhyIncompleteActionID = 2 OR WhyIncompleteActionID = 8 OR WhyIncompleteActionID = 7 OR WhyIncompleteActionID IS NULL)

    AND (MaxCalls > CallAttempt)

    AND (DATEADD(mi, DelayTime, SaveDateTime) < GETDATE())

    AND (CallBackStart IS NULL)

    AND (CallBackEnd IS NULL)

    OR (WhyIncompleteActionID = 2 OR WhyIncompleteActionID = 8 OR WhyIncompleteActionID = 7 OR WhyIncompleteActionID IS NULL)

    AND (MaxCalls > CallAttempt)

    AND (CallBackStart IS NULL)

    AND (CallBackEnd IS NULL)

    AND (SaveDateTime IS NULL)

    Order By Priority, DateAdd(mi, DelayTime, SaveDateTime), AnswerID

    )

    -- Abort if row found

    If @AnswerID>0

    begin

    Goto LockRow

    end

    Else

    begin

    -- must be no records available

    goto NoRecords

    end

    --

    -- Lock The Record and Return It

    LockRow:

    -- Lock the record

    --BEGIN TRAN

    update Answers set InUse = @SessionID Where AnswerID = @AnswerID And InUse = 0

    --COMMIT TRAN

    -- check for no update

    if @@ROWCOUNT = 0

    begin

    -- log looping

    --BEGIN TRAN

    Insert into LoopLog (SessionID, AnswerID, Counter, HitDateTime)

    Select @SessionID, @AnswerID, @i, GetDate()

    --COMMIT TRAN

    -- go to start of loop

    goto RetryStartHere

    end

    -- Return the result

    NoRecords:

    select @AnswerID

    Done:

    --Commit Tran

    --

  • One way would be increase the transaction level to serialized - though that will slow things down. Another is to just trap for deadlocks and have the app issue a retry.

    Andy

    http://www.sqlservercentral.com/columnists/awarren/

  • I have had success using the TABLOCKX locking hint. Although it goes against what seems like common logic, it actually increased performance by about 10x.

    I especially notice improvements when subqueries using the same table are involved. Here is a code snippet, although not the exact same as yours, you can get the idea:

    UPDATE Queue

    SET Status = 'H',

    Start_Time = GETDATE(),

    User_Login = @Uid

    FROM Queue WITH (TABLOCKX INDEX=PK_Queue)

    WHERE Client_Key = (SELECT TOP 1 Client_Key

    FROM Queue WITH (TABLOCKX INDEX=queue1)

    WHERE Project_ID = @Project

    AND Status = 'Q'

    AND ISNULL(Conf_Lev,0) < @Threshold_A

    AND ISNULL(Conf_Lev,0) > @Threshold_B)

    Another technique you may consider is trying to UPDATE at each condition instead of selecting the ID. Then check the @@rowcount after each UPDATE to see if you got something. By using this method, you lock the row (or table if using TABLOCKX) as you check the conditions.

    -Dan

    Edited by - dj_meier on 11/05/2002 10:17:34 AM


    -Dan

  • Thanks for your suggestions Dan. Andy's suggestions are not really constructive... slowing down the sp is not an option (we already have timeouts occurring), and of course the app already resubmites on deadlock, but this is not solving the issue.

    I had tried to issue an update immediately (as per your recommendation) . . but found that multiple users received the same record (oops!). I guess I need to lock the row in question when updating. I thought that the default isolation level of read committed would lock the row and prevent multiple updates occurring at once (I am begining to think that I will never uderstand locking in SQL Server??).

    I will try to issue an update with a row locking hint and see how goes.

    Thanks again.

    Dave

  • The field that is being updated needs to be checked in the WHERE clause. For example, if you are updating the InUse field, then it must say in the UPDATE, "WHERE InUse = 0", or something like that. (I'm assuming AnswerID is unique, and the SessionID value stays set after the user is done w/ the record)

    What about something like this:

    -- First Condition...

    update Answers

    set InUse = @SessionID

    Where AnswerID =

    (

    Select Top 1 AnswerID

    from Answers_Eligible WITH (TABLOCKX)

    Where SaveUserID = @user-id

    And (not CallBackStart is null)

    And (not CallBackEnd is null)

    And CallBackStart < GetDate()

    And CallBackEnd > GetDate()

    And (WhyIncompleteActionID = 3 or WhyIncompleteActionID = 5)

    Order By CallBackStart

    )

    And InUse = 0

    -- Abort if row found

    IF @@ROWCOUNT = 0

    BEGIN

    -- Second Condition

    -- (Your code here)...

    IF @@ROWCOUNT = 0

    BEGIN

    -- Third condition here

    END

    END

    -Dan


    -Dan

  • Thanks Dan. I have not had enough time to implement and test this suggestion . . but I will soon.

    I did not realise that to lock a col it should be in the 'where' clause of the update. Can anyone else please confirm this behaviour?

  • It does not have to be in the where clause to lock the column. That was not the reason for suggesting that course, I don't believe.

    I believe it was suggested to prevent more than a single select from grabbing the data, as one that was already being used would have INUSE <> 0. And its an excellent suggestion, in my opinion......for what thats worth.....

  • We have found it more convenient to separate the SELECT from the UPDATE in order to know if the failure was due to contention or end of queue. Also - ROWLOCK is probably sufficient. Something like:

    SELECT @GoodRow = 0, @AnswerID = NULL

    WHILE @GoodRow = 0

    BEGIN

    SELECT @AnswerID = Top 1 AnswerID

    FROM Answers_Eligible WITH(ROWLOCK)

    WHERE <Condition>

    IF @AnswerID IS NULL

    SELECT @GoodRow = 2

    ELSE

    BEGIN

    UPDATE Answers_Eligible SET InUse = @SessionID

    WHERE AnswerID = @AnswerID AND InUse = 0

    IF @@ROWCOUNT = 1

    SELECT @GoodRow = 1

    END

    IF @GoodRow

    <Code for handling the row>

    ELSE

    <Code for end of the queue>

  • With the select and the update in seperate statements, I believe its possible for two occurrances of this to grab the same row, and the last one to update the record would win. If its done this way, I would think that you would need to wrap the select and the update in a transaction to prevent another occurrance of it from grabbing the same row prior to the update. I think that it would be hard to get two occurrances to get the same row that way, but it would be possible. The read committed transaction level would not prevent two users from reading the same row until the update was initialized. A transaction would lock the row read until the batch of statements finished, where it is obvious that the rowlock does not hold the row locked in between the select and update as the row could not be updated if it was.

    If it is to be in one statement, the update, and the identity passed back, you do not have that issue. If it to be in two statements, wrap it in a transaction to prevent this behavior.

  • quote:


    I have a complex sp that is responsible for selecting, locking and returning the next call in a call centre. It is tricky because I have to ensure that the same record is not returned to multiple users.

    It is quite a long sp that attempts to retrieve a record using a series of select statements, one after the other, until a suitable record is found. It then attempts an update on the record (inserting the sessionid of the user to essentially lock the record).

    My problem is that I have deadlocks occuring and cant figure out a way to avoid them. They are getting very serious at some sites.

    I have tried using updlock in a select, but that doesnt seem to help.

    I know that I could try to think of an alternative approach for selecting the next call (such as building and mantaining tables for my pools of calls using triggers) . . but this would pose other problems (besides the development time!).

    I have tried various ways to use transactions inside the sp - but they generally result in the same call going to multiple users.

    Any suggestions would be hugely appreciated. Here is the sp . . . .

    create Procedure usr_GetNextCall @SessionID int = 0

    AS

    declare @MaxCalls int

    declare @AnswerID int

    declare @user-id int

    declare @i int

    set nocount on

    set ansi_warnings off

    -- Only return 1 row

    set RowCount 1

    -- Store User ID

    select @user-id = (Select UserID from CurrentUsers Where SessionID = @SessionID)

    --

    -- Use a transaction

    --Begin Tran

    SET TRANSACTION ISOLATION LEVEL READ COMMITTED

    --

    -- Set A Default AnswerID

    select @AnswerID = 0

    select @i = 0

    --

    -- 1. Do Operator Specific Call-Backs

    Select @AnswerID =

    (

    Select Top 1 AnswerID

    from Answers_Eligible

    --With(updlock)

    Where SaveUserID = @user-id

    And (not CallBackStart is null)

    And (not CallBackEnd is null)

    And CallBackStart < GetDate()

    And CallBackEnd > GetDate()

    And (WhyIncompleteActionID = 3 or WhyIncompleteActionID = 5)

    Order By CallBackStart

    )

    -- Abort if row found

    If @AnswerID>0 Goto LockRow

    --

    -- 2. Do Operator Generic Call-Backs

    Select @AnswerID =

    (

    Select Top 1 AnswerID

    from Answers_Eligible

    --With(updlock)

    Where (not CallBackStart is null)

    And (not CallBackEnd is null)

    And CallBackStart < GetDate()

    And CallBackEnd > GetDate()

    And (WhyIncompleteActionID = 4)

    Order By CallBackStart

    )

    -- Abort if row found

    If @AnswerID>0 Goto LockRow

    --

    -- 3. Do Operator Specific Then Generic Call-Backs

    Select @AnswerID =

    (

    Select Top 1 AnswerID

    from Answers_Eligible

    --With(updlock)

    Where (not CallBackStart is null)

    And (not CallBackEnd is null)

    And GetDate() > DateAdd(mi, CallBackExpiry, CallBackStart)

    And CallBackEnd > GetDate()

    And (WhyIncompleteActionID = 5)

    Order By CallBackStart

    )

    -- Abort if row found

    If @AnswerID>0 Goto LockRow

    --

    -- 4. Do Forced Records (Top of Pool)

    Select @AnswerID =

    (

    Select Top 1 AnswerID

    from Answers_Eligible

    --With(updlock)

    Where (WhyIncompleteActionID = 6)

    Order By Priority, SaveDateTime

    )

    -- Abort if row found

    If @AnswerID>0 Goto LockRow

    --

    -- Start any retries here

    RetryStartHere:

    -- count loop iterations

    Select @i = @i + 1

    --

    -- 5. Do Remainder of Pool (including records forced into the bottom of the pool)

    Select @AnswerID =

    (

    Select Top 1 AnswerID

    from Answers_Eligible

    --With(updlock)

    WHERE (WhyIncompleteActionID = 2 OR WhyIncompleteActionID = 8 OR WhyIncompleteActionID = 7 OR WhyIncompleteActionID IS NULL)

    AND (MaxCalls > CallAttempt)

    AND (DATEADD(mi, DelayTime, SaveDateTime) < GETDATE())

    AND (CallBackStart IS NULL)

    AND (CallBackEnd IS NULL)

    OR (WhyIncompleteActionID = 2 OR WhyIncompleteActionID = 8 OR WhyIncompleteActionID = 7 OR WhyIncompleteActionID IS NULL)

    AND (MaxCalls > CallAttempt)

    AND (CallBackStart IS NULL)

    AND (CallBackEnd IS NULL)

    AND (SaveDateTime IS NULL)

    Order By Priority, DateAdd(mi, DelayTime, SaveDateTime), AnswerID

    )

    -- Abort if row found

    If @AnswerID>0

    begin

    Goto LockRow

    end

    Else

    begin

    -- must be no records available

    goto NoRecords

    end

    --

    -- Lock The Record and Return It

    LockRow:

    -- Lock the record

    --BEGIN TRAN

    update Answers set InUse = @SessionID Where AnswerID = @AnswerID And InUse = 0

    --COMMIT TRAN

    -- check for no update

    if @@ROWCOUNT = 0

    begin

    -- log looping

    --BEGIN TRAN

    Insert into LoopLog (SessionID, AnswerID, Counter, HitDateTime)

    Select @SessionID, @AnswerID, @i, GetDate()

    --COMMIT TRAN

    -- go to start of loop

    goto RetryStartHere

    end

    -- Return the result

    NoRecords:

    select @AnswerID

    Done:

    --Commit Tran

    --


  • I have a complex sp that is responsible for selecting, locking and returning the next call in a call centre. It is tricky because I have to ensure that the same record is not returned to multiple users.

    I would go for updating the row to process immediately, instead of first selecting, then updating, which would actually require BEGIN TRAN + UPDLOC + COMMIT TRAN to ensure that multiple users do not get the same row.

    If you replace all your selects with stuuf like:

    declare @answ int

    update callbacks set

    @answ = answid,

    inuse = 1

    where answid in

    (select min(answid) from

    callbacks

    where inuse = 0

    -- and additional criterias

    -- and your additional criterias

    )

    print @answ

    go

    or:

    declare @answ int

    update callbacks set

    @answ = answid,

    inuse = 1

    where answid in

    (select top 1 answid from

    callbacks

    where inuse = 0

    -- and additional criterias

    -- and your additional criterias

    order by answid

    )

    print @answ

    go

    Then you should get the answid with natural serialization and no need for a transaction.

    By the way; are you aware that select 1 and 2 could be collapsed to one?

    regrads

    jensk

    (Here is some TSQL for proofing:

    create table callbacks (answid int, inuse int)

    go

    create clustered index aidx on callbacks( answid )

    go

    insert into callbacks values( 1, 0 )

    insert into callbacks values( 2, 0 )

    insert into callbacks values( 3, 1 )

    insert into callbacks values( 4, 0 )

    insert into callbacks values( 5, 1 )

    insert into callbacks values( 6, 0 )

    go

    declare @answ int

    update callbacks set

    @answ = answid,

    inuse = 1

    where answid in

    (select min(answid) from

    callbacks

    where inuse = 0

    -- and additional criterias

    -- and your additional criterias

    )

    print @answ

    go

    declare @answ int

    update callbacks set

    @answ = answid,

    inuse = 1

    where answid in

    (select top 1 answid from

    callbacks

    where inuse = 0

    -- and additional criterias

    -- and your additional criterias

    order by answid

    )

    print @answ

    go

    drop table callbacks

    go

  • Indexing may help, it did for us. By decreasing the total time records are locked, you decrease the chance of a deadlock. Try using the EM index wizard and see if that doesn't help.

  • the sql submitted by jensk2 would solve the issue, and I believe would be the best performer.

    Indexes for the appropriate columns would of course, be great, but wouldn't actually solve the issue, would only relieve some of the symptoms.

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply