Deadlock investigation, help!

  • Hi All,

    I know this is a general cry for help for my specific deadlock but hoping you can help understand the graph, or let me know what i'm missing.

    SQL Server 2008 R2

    1 tables involved: Table A

    TableA - no indexes! Ok, the reason for this is i wanted to check if indexes were a cause of deadlocks, because originally it had 21 non-clustered indexes, and no clustered index! So i made it a heap table and still received the same deadlocks.

    But now i'm thinking that a clustered index may help resolve the deadlock issue... *Maybe*

    Here's the graph:

    deadlock-list

    deadlock victim=process4911948

    process-list

    process id=process4911948 taskpriority=0 logused=0 waitresource=OBJECT: 5:21575115:0 waittime=4476 ownerId=118226314 transactionname=implicit_transaction lasttranstarted=2012-08-30T15:26:20.823 XDES=0x3eab42c60 lockMode=X schedulerid=6 kpid=500 status=suspended spid=95 sbid=0 ecid=0 priority=0 trancount=2 lastbatchstarted=2012-08-30T15:26:20.867 lastbatchcompleted=2012-08-30T15:26:20.840 clientapp=Livelink Enterprise Server hostname=SANLLINK2-STG hostpid=6472 loginname=livelink isolationlevel=read committed (2) xactid=118226314 currentdb=5 lockTimeout=4294967295 clientoption1=671088672 clientoption2=128058

    executionStack

    frame procname=adhoc line=1 stmtstart=72 sqlhandle=0x02000000548063157079ca7ed9f4302a5e5655a6f5ccbb5c

    update TableA set ExtendedData = @P1 where Name = @P2 and SubType = @P3

    frame procname=unknown line=1 sqlhandle=0x000000000000000000000000000000000000000000000000

    unknown

    inputbuf

    (@P1 ntext,@P2 nvarchar(56),@P3 int)update TableA set ExtendedData = @P1 where Name = @P2 and SubType = @P3

    process id=process49454c8 taskpriority=0 logused=2448 waitresource=OBJECT: 5:21575115:0 waittime=4476 ownerId=118226298 transactionname=implicit_transaction lasttranstarted=2012-08-30T15:26:20.763 XDES=0x5c7ab5970 lockMode=X schedulerid=8 kpid=2400 status=suspended spid=92 sbid=0 ecid=0 priority=0 trancount=2 lastbatchstarted=2012-08-30T15:26:20.820 lastbatchcompleted=2012-08-30T15:26:20.817 clientapp=Livelink Enterprise Server hostname=SANLLINK2-STG hostpid=6472 loginname=livelink isolationlevel=read committed (2) xactid=118226298 currentdb=5 lockTimeout=4294967295 clientoption1=671088672 clientoption2=128058

    executionStack

    frame procname=adhoc line=1 stmtstart=1070 sqlhandle=0x020000002e5b8b37b467ec8f8d31844d950625041254ec2e

    insert into TableA ( OwnerID,ParentID,DataID,Name,OriginOwnerID,OriginDataID,UserID,GroupID,UPermissions,GPermissions,WPermissions,SPermissions,ACLCount,PermID,DataType,Reserved,SubType,CreatedBy,CreateDate,ModifyDate,MaxVers,ReservedBy,ReservedDate,VersionNum,DComment,DCategory,ExAtt1,ExAtt2,Ordering,Major,Minor,ReleaseRef,ChildCount,AssignedTo,DateAssigned,DateEffective,DateDue,DateExpiration,DateStarted,DateCompleted,Status,Priority,GIF,ExtendedData,Catalog,CacheExpiration,Deleted,ModifiedBy ) values (@P1,@P2,@P3,@P4,@P5,@P6,@P7,@P8,@P9,@P10,@P11,@P12,@P13,@P14,@P15,@P16,@P17,@P18,@P19,@P20,@P21,@P22,@P23,@P24,@P25,@P26,@P27,@P28,@P29,@P30,@P31,@P32,@P33,@P34,@P35,@P36,@P37,@P38,@P39,@P40,@P41,@P42,@P43,@P44,@P45,@P46,@P47,@P48)

    frame procname=unknown line=1 sqlhandle=0x000000000000000000000000000000000000000000000000

    unknown

    inputbuf

    (@P1 int,@P2 int,@P3 int,@P4 nvarchar(24),@P5 int,@P6 int,@P7 int,@P8 int,@P9 int,@P10 int,@P11 int,@P12 int,@P13 int,@P14 char(1),@P15 char(1),@P16 int,@P17 int,@P18 int,@P19 datetime,@P20 datetime,@P21 int,@P22 int,@P23 char(1),@P24 int,@P25 nvarchar(1),@P26 char(1),@P27 char(1),@P28 char(1),@P29 char(1),@P30 char(1),@P31 char(1),@P32 char(1),@P33 int,@P34 char(1),@P35 char(1),@P36 char(1),@P37 char(1),@P38 char(1),@P39 char(1),@P40 char(1),@P41 char(1),@P42 char(1),@P43 char(1),@P44 char(1),@P45 int,@P46 int,@P47 int,@P48 int)insert into TableA ( OwnerID,ParentID,DataID,Name,OriginOwnerID,OriginDataID,UserID,GroupID,UPermissions,GPermissions,WPermissions,SPermissions,ACLCount,PermID,DataType,Reserved,SubType,CreatedBy,CreateDate,ModifyDate,MaxVers,ReservedBy,ReservedDate,VersionNum,DComment,DCategory,ExAtt1,ExAtt2,Ordering,Major,Minor,ReleaseRef,ChildCount,AssignedTo,DateAssigned,DateEffective,DateDue,DateExpiration,DateStarted,DateCompleted,Status,Priority,GIF,ExtendedData,Catalog,CacheExpiration,Deleted,

    resource-list

    objectlock lockPartition=0 objid=21575115 subresource=FULL dbid=5 objectname=dbA.dbo.TableA id=lock1fd639c80 mode=IX associatedObjectId=21575115

    owner-list

    owner id=process49454c8 mode=IX

    waiter-list

    waiter id=process4911948 mode=X requestType=convert

    objectlock lockPartition=0 objid=21575115 subresource=FULL dbid=5 objectname=dbA.dbo.TableA id=lock1fd639c80 mode=IX associatedObjectId=21575115

    owner-list

    owner id=process4911948 mode=IX

    waiter-list

    waiter id=process49454c8 mode=X requestType=convert

    Am i reading this correctly?

    Process 49454c8 is holding an IX lock on TableA

    Process 4911948 is also holding an IX lock on TableA

    Both want to convert to an X lock to do their INSERT or UPDATE. But neither can convert b/c the other process is holding an IX lock...so deadlock occurs?

    Is that what it's saying?

    Thanks in advance!

  • The Intent Exclusive locks trying to convert to exclusive aren't your direct issue here. They'll attempt to escalate at 5000 rows but it's not required, it's just an optimization to keep lock usage down to reasonable volumes.

    Can you confirm that you get TableA back as the object in the following query, please? By preference, please simply post the results, but I realize you've modified things slightly to obfuscate certain things... well, unless you actually HAVE a table called TableA.

    SELECT * FROM sys.sysobjects where [id] = 21575115

    Also please reconfirm TableA has no indexes whatsoever, also check for unique constraints and anything else that might be hanging around.

    By rights, an Update and an Insert should not collide in a non-indexed heap in any way. Inserts will tail insert and updates will work against existing data pages. Even splits shouldn't affect it. Something is odd here.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • Hi, thanks for the response.

    The query you gave above did return 'TableA' , and yep i double-checked that 3 times previously too lol.

    I am certain there are no indexes and is indeed a heap table.

    There is one constraint on a column, DEFAULT of newid() on a guid column

  • Yeah, sorry I'm not directly helping here. Your assumptions and reading of the graph are accurate. The queries are both conflicting on an IX - X conversion for the same resource (21575115, said table). However, I know of no reason a Heap would conflict on that, the IX is just an optimization piece and shouldn't be required.

    The Update itself is tight and affects very little.

    Hrm. Either I'm missing something obvious or my knowledge isn't up to the task. Calling in the calvary. 🙂


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • Is the table involved in a foreign key relationship?

  • I'm on my phone, so I may not be reading this right, but it looks like both spids have a Tran count of 2, so could there be other tsql in play here?

  • If you have the deadlock graph in XML format, use my script (http://blog.waynesheffield.com/wayne/code-library/shred-deadlock-graph/[/url]) to shred it and see what all is going on. If this doesn't help, post the XML and I'll see what I can do with it.

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • Jo Pattyn (8/30/2012)


    Is the table involved in a foreign key relationship?

    Nope

  • You're getting a deadlock between one process that is updating a large object (text) column, and another that is inserting a new row (which will also include the text column, naturally). My strong suspicion is that the deadlock is occurring during management of the text pages. As you may know, text data is stored off row, in separate pages dedicated for large object use. There's a lot of potential detail here, since exactly how the large object data is stored depends on its size (smaller ones may share a page with other rows from the same partition, for example) and changes in size can result in things moving around in complex ways.

    If you are looking for an exciting life (!) one very good approach is to use the old LOB types (text, ntext and image) in a heap structure that encounters lots of concurrent INSERT/UPDATE/DELETE activity, especially if the old-style LOB columns change size frequently. For extra credit, you can have twenty-odd non-clustered indexes, a very large number of table columns, and some wide data types (e.g. Name nvarchar(56) = up to 112 bytes + overhead per row) 🙂

    I prefer a quiet life myself, where things just work. The design is not helping you here, and you may not be performing the right maintenance regularly to remove ghost and forwarded records, compact LOB space, and generally tidy things up:

    ALTER TABLE dbo.Heap REBUILD;

    ALTER INDEX ALL ON dbo.Heap REORGANIZE WITH (LOB_COMPACTION = ON);

    I would expect that making the heap a clustered table could only help in general, but it will not solve all of your problems, and perhaps not the deadlock issue either. It is probably going to be too hard to nail down the precise cause of your deadlock, but I would definitely want to have an index to help the UPDATE query:

    CREATE UNIQUE NONCLUSTERED INDEX uq1 ON dbo.Heap (Name, SubType)

    I hesitate to suggest that as the clustered index, since the Name column is potentially quite wide.

    Another thing I might try is to add a WITH (PAGLOCK) or WITH (TABLOCK) hint to the INSERT query, assuming the code is accessible to you. This second suggestion is more out of curiosity to see if it helps rather than a serious long-term solution, though.

  • SQL Kiwi (8/30/2012)


    You're getting a deadlock between one process that is updating a large object (text) column, and another that is inserting a new row (which will also include the text column, naturally). My strong suspicion is that the deadlock is occurring during management of the text pages. As you may know, text data is stored off row, in separate pages dedicated for large object use. There's a lot of potential detail here, since exactly how the large object data is stored depends on its size (smaller ones may share a page with other rows from the same partition, for example) and changes in size can result in things moving around in complex ways.

    I would expect that making the heap a clustered table could only help in general, but it will not solve all of your problems, and perhaps not the deadlock issue either. It is probably going to be too hard to nail down the precise cause of your deadlock, but I would definitely want to have an index to help the UPDATE query:

    Another thing I might try is to add a WITH (PAGLOCK) or WITH (TABLOCK) hint to the INSERT query, assuming the code is accessible to you. This second suggestion is more out of curiosity to see if it helps rather than a serious long-term solution, though.

    Thanks for this. Couple other items of interest:

    - we never REBUILD because of the requirement of the system to stay UP all the time

    - we REORGANIZE weekly but on a heap table, i believe the non-clustered indexes will still be fragmented

    What i can do is try to add the non-clustered index and a potential clustered index on the table (on DataID). Then i'll test again.

    I also thought about giving a row locking hint (PAGLOCK) not (TABLOCK), because i the graph isn't it trying to lock the table to do the changes? My theory is that if the INSERT and UPDATE does a PAGLOCK then maybe a deadlock won't occur because the processes are locking at the page level rather than the table level..thus reducing the chance of wanting each other's resources...does that even make sense?

    I barely have access to the code as the app that's sending the queries is developed by a third-party..but i have rare access to them if absolutley needed..just wondering if this can be fixed at the server side before going to client side..

  • msandico 57892 (8/30/2012)


    we never REBUILD because of the requirement of the system to stay UP all the time

    REBUILD supports the ONLINE option. I presume you have Enterprise Edition and are using the ONLINE option already when adding indexes.

    we REORGANIZE weekly but on a heap table, i believe the non-clustered indexes will still be fragmented

    ALTER INDEX ... REORGANIZE allows you to specify ALL indexes or individual ones. Both ALTER INDEX ALL and ALTER TABLE REBUILD affect the heap and all non-clustered indexes, as far as I recall.

    I also thought about giving a row locking hint (PAGLOCK) not (TABLOCK), because in the graph isn't it trying to lock the table to do the changes? My theory is that if the INSERT and UPDATE does a PAGLOCK then maybe a deadlock won't occur because the processes are locking at the page level rather than the table level..thus reducing the chance of wanting each other's resources...does that even make sense?

    That's the broad theory of it, yes.

    I barely have access to the code as the app that's sending the queries is developed by a third-party..but i have rare access to them if absolutley needed..just wondering if this can be fixed at the server side before going to client side..

    Not that I know of. We can't use a plan guide to specify a lock-granularity hint for the table using the TABLE HINT syntax because it affects the semantic of the query (you get error 8722 if you try this).

  • Hi All,

    Just to finish this off, thanks for all the help again.

    I figured out that it was a missing clustered index and defragmentation (or fragmentation, whichever way you look at it, i think specifically data fragmentation), that was the culprit.

    From the graph, my theory was that 2 ad hoc INSERT processes were trying to insert into the table. For some reason, since it was a heap, it was trying to escalate it's lock (i'm assuming at the page level), to a table level lock. Since the two processes occurred within milliseconds of each other, it each grabbed a page, and were both trying to escalate to a table-level lock.....hence, DEADLOCK.

    With the clustered index, i was hoping that it grabbed a page-level lock (or something more granular than a table-level lock), and the lock was held quicker b/c SQL knew exactly where to insert the row. and thus prevent a deadlock..

    Hopefully that reasoning made sense..from a technical persepctive.

  • Hi All,

    Just to finish this off, thanks for all the help again.

    I figured out that it was a missing clustered index and defragmentation (or fragmentation, whichever way you look at it, i think specifically data fragmentation), that was the culprit.

    From the graph, my theory was that 2 ad hoc INSERT processes were trying to insert into the table. For some reason, since it was a heap, it was trying to escalate it's lock (i'm assuming at the page level), to a table level lock. Since the two processes occurred within milliseconds of each other, it each grabbed a page, and were both trying to escalate to a table-level lock.....hence, DEADLOCK.

    With the clustered index, i was hoping that it grabbed a page-level lock (or something more granular than a table-level lock), and the lock was held quicker b/c SQL knew exactly where to insert the row. and thus prevent a deadlock..

    Hopefully that reasoning made sense..from a technical persepctive.

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply