Reading Deadlock Trace

  • Hi I wonder if anyone can help me troubleshoot this deadlock?

    The input Buffer for SPID 829 is Proc [Database Id = 7 Object Id = 1003150619], I have found out what proc this is, but now what do I do to resolve the issue?

    Thanks for any help

    <deadlock-list>

    <deadlock victim="processff0ef8">

    <process-list>

    <process id="processfda868" taskpriority="0" logused="4288" waitresource="KEY: 7:72057594042777600 (9b0130c3a9ea)" waittime="5000" ownerId="76534496436" transactionname="user_transaction" lasttranstarted="2010-07-20T10:03:26.843" XDES="0x13c227030" lockMode="U" schedulerid="3" kpid="9960" status="suspended" spid="829" sbid="0" ecid="0" priority="0" transcount="2" lastbatchstarted="2010-07-20T10:03:26.843" lastbatchcompleted="2010-07-20T10:03:26.843" clientapp="Windows SharePoint Services" hostname="NIHR-MOSS1" hostpid="3112" loginname="xxx\xxx" isolationlevel="read committed (2)" xactid="76534496436" currentdb="7" lockTimeout="4294967295" clientoption1="673316896" clientoption2="128056">

    <executionStack>

    <frame procname="WSS_Search_NIHR-MOSS1.dbo.proc_MSS_ProcessDelete" line="162" stmtstart="13936" stmtend="14066" sqlhandle="0x03000700e2b8d63aa9ca57014a9b00000100000000000000">

    DELETE FROM MSSCrawlQueue WHERE DocID=@DocID AND BatchID = 0 </frame>

    <frame procname="WSS_Search_NIHR-MOSS1.dbo.proc_MSS_ProcessCommitted" line="251" stmtstart="21118" stmtend="22772" sqlhandle="0x030007001bddca3badca57014a9b00000100000000000000">

    EXEC dbo.proc_MSS_ProcessDelete

    @ProjectID,

    @HisContentSourceID,

    @StartAddressID,

    @DocID,

    @CrawlID,

    @HisCrawlID,

    @HisCommitCrawlID,

    @Scope,

    @HisTransactionFlags,

    @UseChangeLog,

    @ChangeLogCookie,

    @ChangeLogBatchID,

    @DeleteReason,

    @TransactionType,

    @HostDepth,

    @EnumerationDepth,

    @HisParentDocID,

    @EndPathFlag,

    @HostID,

    @LCID,

    @ErrorID,

    @ErrorLevel,

    @AccessURL,

    @AccessHash,

    @CompactURL,

    @CompactHash,

    @DisplayURL,

    @DisplayHash,

    @MaxDocId </frame>

    </executionStack>

    <inputbuf>

    Proc [Database Id = 7 Object Id = 1003150619] </inputbuf>

    </process>

    <process id="processff0ef8" taskpriority="0" logused="820" waitresource="KEY: 7:72057594042580992 (020068e8b274)" waittime="10000" ownerId="76534496514" transactionname="user_transaction" lasttranstarted="2010-07-20T10:03:26.843" XDES="0x2c0e5af70" lockMode="X" schedulerid="4" kpid="7460" status="suspended" spid="533" sbid="0" ecid="0" priority="0" transcount="2" lastbatchstarted="2010-07-20T10:03:26.843" lastbatchcompleted="2010-07-20T10:03:26.843" clientapp="Windows SharePoint Services" hostname="NIHR-MOSS1" hostpid="3112" loginname="xxx\xxx" isolationlevel="read committed (2)" xactid="76534496514" currentdb="7" lockTimeout="4294967295" clientoption1="673316896" clientoption2="128056">

    <executionStack>

    <frame procname="WSS_Search_NIHR-MOSS1.dbo.proc_MSS_UpdateHostStats" line="35" stmtstart="3098" sqlhandle="0x03000700a994e239a4ca57014a9b00000100000000000000">

    UPDATE MSSCrawlHostList SET

    SuccessCount = CASE WHEN SuccessCount + @SuccessCount > 0 THEN SuccessCount + @SuccessCount ELSE 0 END,

    ErrorCount = CASE WHEN ErrorCount + @ErrorCount > 0 THEN ErrorCount + @ErrorCount ELSE 0 END,

    WarningCount = CASE WHEN WarningCount + @WarningCount > 0 THEN WarningCount + @WarningCount ELSE 0 END

    WHERE HostID = @HostID </frame>

    <frame procname="WSS_Search_NIHR-MOSS1.dbo.proc_MSS_ProcessCommitted" line="134" stmtstart="13172" stmtend="13500" sqlhandle="0x030007001bddca3badca57014a9b00000100000000000000">

    EXEC dbo.proc_MSS_UpdateHostStats @HostID, @TransactionType, @MarkDelete, @ErrorID, @ErrorLevel, @HisCommitCrawlID, @HisErrorLevel, @TrackIDDel, @DelErrorLevel </frame>

    </executionStack>

    <inputbuf>

    Proc [Database Id = 7 Object Id = 1003150619] </inputbuf>

    </process>

    </process-list>

    <resource-list>

    <keylock hobtid="72057594042580992" dbid="7" objectname="WSS_Search_NIHR-MOSS1.dbo.MSSCrawlHostList" indexname="PK_MSSCrawlHostList" id="lockc3091d00" mode="X" associatedObjectId="72057594042580992">

    <owner-list>

    <owner id="processfda868" mode="X"/>

    </owner-list>

    <waiter-list>

    <waiter id="processff0ef8" mode="X" requestType="wait"/>

    </waiter-list>

    </keylock>

    <keylock hobtid="72057594042777600" dbid="7" objectname="WSS_Search_NIHR-MOSS1.dbo.MSSCrawlQueue" indexname="IX_MSSCrawlQueue_Cluster" id="lockdeaeef80" mode="X" associatedObjectId="72057594042777600">

    <owner-list>

    <owner id="processff0ef8" mode="X"/>

    </owner-list>

    <waiter-list>

    <waiter id="processfda868" mode="U" requestType="wait"/>

    </waiter-list>

    </keylock>

    </resource-list>

    </deadlock>

    </deadlock-list>

    1.

  • it is not easy to answer without knowing more, but ...

    change either:

    dbo.proc_MSS_ProcessDelete

    -- or --

    dbo.proc_MSS_UpdateHostStats

    to access tables:

    WSS_Search_NIHR-MOSS1.dbo.MSSCrawlHostList

    -- and --

    WSS_Search_NIHR-MOSS1.dbo.MSSCrawlQueue

    in the same order

    if the existing logic doesn't lend itself to a re-ordering

    consider using a select to obtain and hold a lock

  • Hi thanks for your reply. I think I am reading the XML incorrectly. I had thought that the code in the Input Buffer was the one executing during the deadlock.

    In this case for both SPIDS it says that Proc [Database Id = 7 Object Id = 1003150619] was the code running. This is a Stored Proc called proc_MSS_ProcessComitted.

    So am I wrong in thinking then that at the time of the deadlock both SPIDS were executing proc_MSS_ProcessComitted and it is that that requires the investigation?

    Any help on actually reading the Execution Stack would be helpful as its really confusing to me.

    Thanks

  • This code might help you out. Just take the deadlock XML info you posted above, put it into the @deadlock variable, and run.

    declare @deadlock xml

    set @deadlock = 'put your deadlock graph here'

    select

    [PagelockObject] = @deadlock.value('/deadlock-list[1]/deadlock[1]/resource-list[1]/pagelock[1]/@objectname', 'varchar(200)'),

    [DeadlockObject] = @deadlock.value('/deadlock-list[1]/deadlock[1]/resource-list[1]/objectlock[1]/@objectname', 'varchar(200)'),

    [Victim] = case when Deadlock.Process.value('@id', 'varchar(50)') = @deadlock.value('/deadlock-list[1]/deadlock[1]/@victim', 'varchar(50)') then 1 else 0 end,

    [Procedure] = Deadlock.Process.value('executionStack[1]/frame[1]/@procname[1]', 'varchar(200)'),

    [LockMode] = Deadlock.Process.value('@lockMode', 'char(1)'),

    [Code] = Deadlock.Process.value('executionStack[1]/frame[1]', 'varchar(1000)'),

    [ClientApp] = Deadlock.Process.value('@clientapp', 'varchar(100)'),

    [HostName] = Deadlock.Process.value('@hostname', 'varchar(20)'),

    [LoginName] = Deadlock.Process.value('@loginname', 'varchar(20)'),

    [TransactionTime] = Deadlock.Process.value('@lasttranstarted', 'datetime'),

    [InputBuffer] = Deadlock.Process.value('inputbuf[1]', 'varchar(1000)')

    from @deadlock.nodes('/deadlock-list/deadlock/process-list/process') as Deadlock(Process)

    From this, I see that you are running an UPDATE and a DELETE at the same time from your sharepoint portal (different procs).

    If you were to look at the execution plans for these commands, I bet you would find that one or both is doing table scans, effectively locking the entire table.

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • Oh wow I was just looking at some of your code on another post 🙂 it makes it much easier to read thanks.

    I do have some niggly questions like why does the LockMode U appear in the graph for the delete and vice versa? I assume its just the way the graph is rendered but it bugs me :s

    The other question is what importance does the Input Buffer have, I think I was following this as a red herring. The proc mentioned in the Input Buffer actually ends up being proc_MSS_ProcessCommitted how does this relate to the deadlock? This proc is also mentioned in the <frame procname for both processes and am confused as to its relevance in the graphs.

    I will also look at the execution plans for the two procs in questions, I no doubt will have more questions ..grr...

    Many thanks 🙂

  • wayne: that is very handy bit of sql

    it is impossible / unreasonable and unnecessary to stop all deadlocks

    as long as there is correct retry logic

    they are irrelevant

    if you have an application with no deadlock handling

    then you can add the deadlock retry in to the SPs with try catch

  • No I understand that but I was taking this opportunity to learn how to trouble shoot and understand the layout of the XML thats all.

  • doobya (7/21/2010)


    wayne: that is very handy bit of sql

    :blush: Thanks. It's been very handy for me also.

    it is impossible / unreasonable and unnecessary to stop all deadlocks

    It depends on what the cause of the deadlock is. It's not unreasonable to investigate each and every deadlock, and see what you can do about it.

    FWIW, when I started my job a bit over a year ago, we were getting regular deadlocks. After some work to set up traces to trap all deadlocks, and to investigate the causes and do some re-coding / indexing, deadlocks on our systems are now a very rare event.

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • Kwisatz78 (7/21/2010)


    Oh wow I was just looking at some of your code on another post 🙂 it makes it much easier to read thanks.

    I do have some niggly questions like why does the LockMode U appear in the graph for the delete and vice versa? I assume its just the way the graph is rendered but it bugs me :s

    The other question is what importance does the Input Buffer have, I think I was following this as a red herring. The proc mentioned in the Input Buffer actually ends up being proc_MSS_ProcessCommitted how does this relate to the deadlock? This proc is also mentioned in the <frame procname for both processes and am confused as to its relevance in the graphs.

    I will also look at the execution plans for the two procs in questions, I no doubt will have more questions ..grr...

    Many thanks 🙂

    The input buffer is the code that was called - it looks like you've identified it. It appears to contain both the update and delete statements.

    I was looking into your graph a little bit more, and noticed that my code isn't returning what object is being locked. Looking into your graph, I see that they are locking on indexes. So, I've modified the code to return the indexes being locked.

    select

    [PagelockObject] = @deadlock.value('/deadlock-list[1]/deadlock[1]/resource-list[1]/pagelock[1]/@objectname', 'varchar(200)'),

    [DeadlockObject] = @deadlock.value('/deadlock-list[1]/deadlock[1]/resource-list[1]/objectlock[1]/@objectname', 'varchar(200)'),

    [KeylockObject] = Keylock.Process.value('@objectname', 'varchar(200)'),

    [Index] = Keylock.Process.value('@indexname', 'varchar(200)'),

    [IndexLockMode] = Keylock.Process.value('@mode', 'varchar(5)'),

    [Victim] = case when Deadlock.Process.value('@id', 'varchar(50)') = @deadlock.value('/deadlock-list[1]/deadlock[1]/@victim', 'varchar(50)') then 1 else 0 end,

    [Procedure] = Deadlock.Process.value('executionStack[1]/frame[1]/@procname[1]', 'varchar(200)'),

    [LockMode] = Deadlock.Process.value('@lockMode', 'char(1)'),

    [Code] = Deadlock.Process.value('executionStack[1]/frame[1]', 'varchar(1000)'),

    [ClientApp] = Deadlock.Process.value('@clientapp', 'varchar(100)'),

    [HostName] = Deadlock.Process.value('@hostname', 'varchar(20)'),

    [LoginName] = Deadlock.Process.value('@loginname', 'varchar(20)'),

    [TransactionTime] = Deadlock.Process.value('@lasttranstarted', 'datetime'),

    [InputBuffer] = Deadlock.Process.value('inputbuf[1]', 'varchar(1000)')

    from @deadlock.nodes('/deadlock-list/deadlock/process-list/process') as Deadlock(Process)

    LEFT JOIN @deadlock.nodes('/deadlock-list/deadlock/resource-list/keylock') as Keylock(Process)

    ON Keylock.Process.value('owner-list[1]/owner[1]/@id', 'varchar(50)') =

    Deadlock.Process.value('@id', 'varchar(50)')

    This shows that the delete has an exclusive lock on the PK, and wants an update lock on the IX. The update has an exclusive lock on the IX, and wants an exclusive lock on the PK. The queries are pretty simple, so I'd check to see if there is an index on HostID, and one on DocID/BatchID.

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • FWIW, when I started my job a bit over a year ago, we were getting regular deadlocks. After some work to set up traces to trap all deadlocks, and to investigate the causes and do some re-coding / indexing, deadlocks on our systems are now a very rare event.

    my first big sql database was plagued with deadlocks during pre-production testing

    I managed to redesign the system and achieve zero deadlocks

    (it was a busy but simple / almost classic transaction processing application)

    I thought I could repeat that on any database ... but once the complexity increases

    the possible permutations of access becomes astronomical and what you have is a humungous race condition

    So now I design for zero deadlocks, fail then redesign for the high frequency deadlocks and let retry logic mop up the low frequency deadlocks

  • OP: use the deadlock xml to determine the statements and resources involved in the deadlock

    but to understand the deadlock you have to work backwards (possibly across multiple procedures / views etc.)

    and find the statements that led to the held locks

    because the xml doesn't tell you that

    (this can be nasty with a deep call stack and lots of conditionals)

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply