Understanding locking behavior causing deadlock

  • I'm currently investigating a deadlock issue on a heavily used database but still wasn't able to reproduce it even using concurrent ostress sessions or WHILE 1=1 EXEC StoredProcs. I'd be really glad anyone could shed some light on ways to reproduce it and also help me understand it's behavior.

    There are two tables involved and they're partitioned by a hash in a computed column based on the original PK. They were partitioned due to page latch issues on INSERTs in the past.

    Deadlock graph:

    deadlock-list
    deadlock victim=process48094508
    process-list
    process id=process48094508 taskpriority=0 logused=428 waitresource=KEY: 10:72057666620227584 (d79f02e56828) waittime=2832 ownerId=28655562221 transactionname=implicit_transaction lasttranstarted=2020-04-19T09:37:13.823 XDES=0x7d33ef970 lockMode=X schedulerid=15 kpid=22212 status=suspended spid=2684 sbid=0 ecid=0 priority=0 trancount=2 lastbatchstarted=2020-04-19T09:37:13.857 lastbatchcompleted=2020-04-19T09:37:13.857 clientapp=websphere1 hostname=wsserver1 hostpid=0 loginname=dblogin isolationlevel=repeatable read (3) xactid=28655562221 currentdb=10 lockTimeout=4294967295 clientoption1=673185824 clientoption2=128058
    executionStack
    frame procname=MYDB.dbo.pInsVarConn line=14 stmtstart=470 stmtend=690 sqlhandle=0x03000b00a3f2cc6277eb2bc0b8a900000100000000000000
    INSERT INTO tVarConn (nConn, iVarConn, rVarConn)
    VALUES (@nConn, @iVarConn, @rVarConn)
    frame procname=MYDB.dbo.pPrcInsVarConnao01 line=66 stmtstart=4890 stmtend=5016 sqlhandle=0x03000a004c202f3094703d009e1b00000100000000000000
    EXEC dbo.pInsVarConn @nConn, @iVarConn, @rVarConn
    frame procname=adhoc line=1 stmtstart=96 sqlhandle=0x01000a00b3117f0a40e2b01b0f0000000000000000000000
    EXEC pPrcInsVarConnao01 @P0,@P1,@P2
    inputbuf
    (@P0 bigint,@P1 varchar(8000),@P2 varchar(8000))EXEC pPrcInsVarConnao01 @P0,@P1,@P2
    process id=process48153948 taskpriority=0 logused=11224 waitresource=KEY: 10:72057666620227584 (0a7b9247f732) waittime=2832 ownerId=28655563535 transactionname=DELETE lasttranstarted=2020-04-19T09:37:13.857 XDES=0x2760ee83b0 lockMode=X schedulerid=34 kpid=31228 status=suspended spid=2553 sbid=0 ecid=0 priority=0 trancount=2 lastbatchstarted=2020-04-19T09:37:13.857 lastbatchcompleted=2020-04-19T09:37:13.263 clientapp=websphere1 hostname=wsserver1 hostpid=0 loginname=dblogin isolationlevel=repeatable read (3) xactid=28655563535 currentdb=10 lockTimeout=4294967295 clientoption1=673185824 clientoption2=128056
    executionStack
    frame procname=MYDB.dbo.pPrcEndConnao02 line=54 stmtstart=2480 stmtend=2666 sqlhandle=0x03000a00d95fa90184eb2b00b8aa00000100000000000000
    DELETE FROM dbo.tVarConn WHERE nConn = @IdConnao AND HashID = @HashID
    frame procname=adhoc line=1 stmtstart=132 sqlhandle=0x01000a00a7b5550840229a10010d00000000000000000000
    exec pPrcEndConnao02 @P0,@P1,@P2,@P3
    frame procname=unknown line=1 sqlhandle=0x000000000000000000000000000000000000000000000000
    unknown
    inputbuf
    (@P0 bigint,@P1 varchar(8000),@P2 varchar(8000),@P3 varchar(8000))exec pPrcEndConnao02 @P0,@P1,@P2,@P3
    resource-list
    keylock hobtid=72057666620227584 dbid=10 objectname=MYDB.dbo.tVarConn indexname=PKtVarConn id=lock495af880 mode=X associatedObjectId=72057666620227584
    owner-list
    owner id=process48153948 mode=X
    waiter-list
    waiter id=process48094508 mode=X requestType=wait
    keylock hobtid=72057666620227584 dbid=10 objectname=MYDB.dbo.tVarConn indexname=PKtVarConn id=lock1d7ff99b80 mode=X associatedObjectId=72057666620227584
    owner-list
    owner id=process48094508 mode=X
    waiter-list
    waiter id=process48153948 mode=X requestType=wait

    Thanks to this nice article http://www.sqlnotes.info/2012/06/26/locking-behavior-fks/(Found it to be easier to do this way when doing several times - don't have to issue SELECTs and DBCC PAGE), while trying to reproduce the scenario, I found out that the INSERT statement acquires the following locks:

    1. Obj-IS on tConn
    2. Obj-IX on tVarConn
    3. Page-IX on page from tVarConn
    4. Key-RI_NL to the next %%lockres%% from tVarConn
    5. Key-X on the %%lockres%% for the row being INSERTed
    6. Page-IS on a page from tConn
    7. Key-S on a %%lockres%% for the parent key value from tConn

    And the DELETE's:

    1. Obj-IX on tVarConn
    2. Page-IX on tVarConn
    3. Key-X on tVarConn (Several of them with different %%lockres%% since I have multiple rows for the nConn/HashID pair)

     

    Plans:

    yDuAgzDUgJ

     

    What am I missing? How can these two get involved in the deadlock above? How can I reproduce it so I can test possible solutions?

    Sorry if this got too long, I wanted to be throught..

    <!--more-->

    Parent table:

    Name      Owner    Type          
    --------- -------- -------------
    tConn dbo user table

    Column_name Type Computed Length Prec Scale Nullable TrimTrailingBlanks FixedLenNullInSource Collation
    -------------- ----------- ----------- --------- ----- ----- ----------- ---------------------- ------------------------- ----------------------------
    nConn int no 4 10 0 no (n/a) (n/a) NULL
    Col1 smallint no 2 5 0 no (n/a) (n/a) NULL
    Col2 char no 23 no no no SQL_Latin1_General_CP1_CI_AI
    Col3 datetime no 8 no (n/a) (n/a) NULL
    Col4 char no 20 no no no SQL_Latin1_General_CP1_CI_AI
    Col5 datetime no 8 no (n/a) (n/a) NULL
    Col6 varchar no 50 yes no yes SQL_Latin1_General_CP1_CI_AI
    Col7 char no 1 no no no SQL_Latin1_General_CP1_CI_AI
    Col8 smallint no 2 5 0 no (n/a) (n/a) NULL
    Col9 smallint no 2 5 0 no (n/a) (n/a) NULL
    Col10 smallint no 2 5 0 no (n/a) (n/a) NULL
    Col11 decimal no 9 10 0 no (n/a) (n/a) NULL
    Col12 decimal no 5 5 0 yes (n/a) (n/a) NULL
    HashID tinyint yes 1 3 0 no (n/a) (n/a) NULL

    Identity Seed Increment Not For Replication
    ----------- ------ ----------- -------------------
    nConn 1 1 0

    Data_located_on_filegroup
    --------------------------
    schPartConn

    index_name index_description index_keys
    ------------- ------------------------------------------------------ -----------------
    Idx1tConn nonclustered located on schPartConn Col2
    Idx2tConn nonclustered located on schPartConn Col1, Col4, Col7
    PKtConn clustered, unique, primary key located on schPartConn HashID, nConn

    constraint_type constraint_name delete_action update_action status_enabled status_for_replication constraint_keys
    ------------------------- ----------------- ------------- ------------- -------------- ---------------------- ----------------------------------
    FOREIGN KEY FktAplic01 No Action No Action Enabled Is_For_Replication Col1
    REFERENCES MYDB.dbo.tAplic (Col1)
    PRIMARY KEY (clustered) PKtConn (n/a) (n/a) (n/a) (n/a) HashID, nConn

    Table is referenced by foreign key
    -----------------------------------
    MYDB.dbo.tVarConn: FktConn01

    Child table:

    Name       Owner   Type       
    ---------- ------- -----------
    tVarConn dbo user table

    Column_name Type Computed Length Prec Scale Nullable TrimTrailingBlanks FixedLenNullInSource Collation
    -------------- ---------- ----------- -------- ----- ----- ----------- --------------------- ---------------------- -----------------------------
    nConn int no 4 10 0 no (n/a) (n/a) NULL
    iVarConn char no 20 no no no SQL_Latin1_General_CP1_CI_AI
    rVarConn varchar no 8000 no no no SQL_Latin1_General_CP1_CI_AI
    HashID tinyint yes 1 3 0 no (n/a) (n/a) NULL

    Identity Seed Increment Not For Replication
    ----------------------------- ------ ----------- -------------------
    No identity column defined. NULL NULL NULL

    Data_located_on_filegroup
    ---------------------------
    schPartConn

    index_name index_description index_keys
    --------------- -------------------------------------------------------- ------------------------
    PKtVarConn clustered, unique, primary key located on schPartConn HashID, nConn, iVarConn

    constraint_type constraint_name delete_action update_action status_enabled status_for_replication constraint_keys
    ------------------------ ------------------ ------------- ------------- -------------- ---------------------- ----------------
    FOREIGN KEY FktConn01 Cascade No Action Enabled Is_For_Replication HashID, nConn
    REFERENCES MYDB.dbo.tConn (HashID, nConn)
    PRIMARY KEY (clustered) PKtVarConn (n/a) (n/a) (n/a) (n/a) HashID, nConn, iVarConn

    No foreign keys reference table 'tVarConn', or you do not have permissions on referencing tables.

    Stored Procs:

    CREATE PROCEDURE [dbo].[pInsVarConn]  (
    @nConn int,
    @iVarConn char(20),
    @rVarConn varchar(8000)
    )
    AS
    BEGIN
    SET NOCOUNT ON

    INSERT INTO tVarConn (nConn, iVarConn, rVarConn)
    VALUES (@nConn, @iVarConn, @rVarConn)
    END
    go


    CREATE PROCEDURE [dbo].[pPrcInsVarConnao01] (
    @nConn int,
    @iVarConn char(20),
    @rVarConn varchar(8000)
    )
    AS
    BEGIN
    SET NOCOUNT ON

    DECLARE @HashID tinyint
    SET @HashID =CONVERT([tinyint],abs(@nConn%(72)),(0))

    IF EXISTS(SELECT TOP 1 1 FROM tVarConn WITH(NOLOCK) WHERE nConn = @nConn AND HashID = @HashID AND iVarConn = @iVarConn)
    BEGIN
    exec [pUpdtVarConn] @nConn, @iVarConn, @rVarConn
    END
    ELSE
    BEGIN
    exec [pInsVarConn] @nConn, @iVarConn, @rVarConn
    END
    END
    go


    CREATE PROCEDURE [dbo].[pPrcEndConnao02](
    @IdConnao int,
    @SeqGen char(23),
    @CurrEnd char(50),
    @Valid char(1) = 'N'
    )
    AS
    DECLARE @HashID tinyint
    DECLARE @EndUs char(50)
    DECLARE @StConnao char(1)
    DECLARE @RC smallint

    BEGIN
    SET NOCOUNT ON

    SET @HashID =CONVERT([tinyint],abs(@IdConnao%(72)),(0))

    SELECT
    @EndUs = Col6,
    @StConnao = Col7
    FROM
    dbo.tConn WITH(NOLOCK)
    WHERE
    nConn = @IdConnao AND
    HashID = @HashID AND
    Col2 = @SeqGen

    IF(@@ROWCOUNT = 0)
    BEGIN
    SET @RC = -5014
    END
    ELSE
    BEGIN
    IF(@stConnao = 'F')
    BEGIN
    SET @RC = -5012
    END
    ELSE
    BEGIN
    UPDATE dbo.tConn
    SET Col7 = 'F'
    WHERE nConn = @IdConnao
    AND HashID = @HashID
    AND Col2 = @SeqGen

    DELETE FROM dbo.tVarConn WHERE nConn = @IdConnao AND HashID = @HashID
    SET @RC = 0
    END
    END

    SELECT @RC AS Ret
    END

    Thanks a bunch!

    FT

    • This topic was modified 3 years, 11 months ago by  Fernando Toledo. Reason: Improved formatting
    • This topic was modified 3 years, 11 months ago by  Fernando Toledo.
  • I didn't study this in detail, but one thing I have noticed that people miss in deadlock analysis is that the processes may be involved in a multi-statement transaction. Look here:

    process id=process48094508 taskpriority=0 logused=428 waitresource=KEY: 10:72057666620227584 (d79f02e56828) waittime=2832 ownerId=28655562221 transactionname=implicit_transaction lasttranstarted=2020-04-19T09:37:13.823 XDES=0x7d33ef970 lockMode=X schedulerid=15 kpid=22212 status=suspended spid=2684 sbid=0 ecid=0 priority=0 trancount=2 lastbatchstarted=2020-04-19T09:37:13.857 lastbatchcompleted=2020-04-19T09:37:13.857 clientapp=websphere1 hostname=wsserver1 hostpid=0 loginname=dblogin isolationlevel=repeatable read (3) xactid=28655562221 currentdb=10 lockTimeout=4294967295 clientoption1=673185824 clientoption2=128058

    So this process is running a transaction, which started in a previous request from the client. Thus, it may be blocks from a previous operation that is blocking the other process. (That process appears to be running a single-statement transaction.)

    The isolation level Repeatable Read means that locks are held longer, although it may not matter here since the collision is on two X-locks.

     

    [font="Times New Roman"]Erland Sommarskog, SQL Server MVP, www.sommarskog.se[/font]

  • -- Edit: Forgot to quote a reply

  • Erland Sommarskog wrote:

    I didn't study this in detail, but one thing I have noticed that people miss in deadlock analysis is that the processes may be involved in a multi-statement transaction. Look here:

    process id=process48094508 taskpriority=0 logused=428 waitresource=KEY: 10:72057666620227584 (d79f02e56828) waittime=2832 ownerId=28655562221 transactionname=implicit_transaction lasttranstarted=2020-04-19T09:37:13.823 XDES=0x7d33ef970 lockMode=X schedulerid=15 kpid=22212 status=suspended spid=2684 sbid=0 ecid=0 priority=0 trancount=2 lastbatchstarted=2020-04-19T09:37:13.857 lastbatchcompleted=2020-04-19T09:37:13.857 clientapp=websphere1 hostname=wsserver1 hostpid=0 loginname=dblogin isolationlevel=repeatable read (3) xactid=28655562221 currentdb=10 lockTimeout=4294967295 clientoption1=673185824 clientoption2=128058

    Thanks for the reply Erland. I know the post ended up being a massive amount of info... Sorry!

    Anyways, the implicit transaction is executing pPrcInsVarConnao01, which doesn't open explicit transactions and exec pInsVarConn. This other proc runs a plain INSERT INTO tVarConn (nConn, iVarConn, rVarConn) VALUES (@nConn, @iVarConn, @rVarConn).

    Could this implicit_transaction hide a distributed tran/transaction open before the stored proc call?

  • The DELETE statement is deleting multiple rows with the same (nConn, HashId). I would guess the other process is trying to insert rows with the same key values. It seems that is now trying to insert an rVarConn the DELETE has deleted. The DELETE in its turn is trying to delete a row which the other process inserted earlier in the same transaction.

    I can't say that those NOLOCK that appears in the code gives me any confidence.

    [font="Times New Roman"]Erland Sommarskog, SQL Server MVP, www.sommarskog.se[/font]

  • Erland Sommarskog wrote:

    I didn't study this in detail, but one thing I have noticed that people miss in deadlock analysis is that the processes may be involved in a multi-statement transaction. Look here:

    process id=process48094508 taskpriority=0 logused=428 waitresource=KEY: 10:72057666620227584 (d79f02e56828) waittime=2832 ownerId=28655562221 transactionname=implicit_transaction lasttranstarted=2020-04-19T09:37:13.823 XDES=0x7d33ef970 lockMode=X schedulerid=15 kpid=22212 status=suspended spid=2684 sbid=0 ecid=0 priority=0 trancount=2 lastbatchstarted=2020-04-19T09:37:13.857 lastbatchcompleted=2020-04-19T09:37:13.857 clientapp=websphere1 hostname=wsserver1 hostpid=0 loginname=dblogin isolationlevel=repeatable read (3) xactid=28655562221 currentdb=10 lockTimeout=4294967295 clientoption1=673185824 clientoption2=128058

    So this process is running a transaction, which started in a previous request from the client. Thus, it may be blocks from a previous operation that is blocking the other process. (That process appears to be running a single-statement transaction.)

    The isolation level Repeatable Read means that locks are held longer, although it may not matter here since the collision is on two X-locks.

     

    You were absolutely right.

    Found out that there were resources being locked in a outer transaction opened by the java app which was disabling the autocommit setting to run several stored procedures and then commit.

    Thanks again, FT

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply