I received an email from my boss on the morning supervisors approve timesheets. He stated that they are complaining about an error: Deadlock. Searching the SQL Log, I was able to get the details of the deadlocking. We have Trace Flags 1204 and 1222 turned on which provides the details.
Once I got the details and having followed Bart Duncan’s Deadlock Troubleshooting, I found the victim and successful DML statements.
Next, I started a trace to get some parameter values for the victim T-SQL. By the way, the UPDATE statement was the higher priority statement that succeeded and a SELECT was the deadlock victim. Not the classic UPDATE/UPDATE example so many people use to illustrate deadlocking. When I got the Execution Plan for the victim, I noticed the Seek was on a Non-Clustered index with the INCLUDE option (Covering Index). After the Seek, there was a lookup on the Cluster Index.
Now, months ago I added the covering index to help this same SQL statement, but now it needed 2 extra columns from original suggested index.
What changed? Well, before I got the answer on what changed, I dropped the index so the users could proceed and not get the error, with some performance issues (it was taking 5-6 seconds instead of less than one)
What Changed? The previous week, the development team released a new version of the application and add 2 columns to the SELECT statement.
So, here is an example of what happened with the Adventure Works database.
If you run the above query with Include Execution Plan, you will get a Missing Index like the following:
The Execution Plan is using a Clustered Index scan to find the data with a cost of 0.54
Adding the index, the Execution Plan now does an Index Seek with a cost of 0.0033. Big improvement!!!
Now, when I add a column to the SELECT query that is not in the Covering Index:
The INCLUDE column index (covering index) is used in a SEEK, but then a Key Lookup is added to get the additional 2 columns from the Clustered Index.
So, how does this cause a deadlock. I have seen this problem many times over the last 5-6 years.
The SELECT statement starts to SEEK the Covering Index at the same time an UPDATE statement locks the clustered index to update, then tries to update the covering Index. But, the SELECT statement has a shared lock on the Covering Index (Non-clustered Index) that is now trying to place a shared lock on the Clustered Index, which is locked by the UPDATE statement. BOOM!!! Deadlock!!!
How do you fix this?
First, you can add the 2 columns to the covering index and the SELECT no longer needs a Key Lookup on the clustered index. This is fine, but I wish it could be found in development and not production.
Or, you could just remove the Covering Index (like I initially did) and find other ways with the development team to improve performance. I do not believe they even knew what I did when I added the Covering Index, nor did I know that had added some additional columns to the SELECT statement.