Deadlocking advice on a frequently rebuilt reporting table

  • Standard Procedure: Changing specific details of an employee fires a trigger (originally a trigger with a cursor; the cursor was later removed, the trigger remains) that generates some history records, then runs back over the updated employee's history, builds the reporting rows for that employee in a temp table, and compares them against the real reporting table. Old records are purged; new/altered records are added.

    This happens three times in a row, once for each of three reporting tables.
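
    For reference, the delete/insert sync at the end of each rebuild looks roughly like this (a minimal sketch; the table and column names are placeholders, not the real schema):

        -- 1. Purge reporting rows for this employee that no longer appear in the rebuilt set.
        DELETE r
        FROM Report1 AS r
        WHERE r.EmployeeID = @EmployeeID
          AND NOT EXISTS (SELECT 1
                          FROM #EmpReport AS t
                          WHERE t.RowKey = r.RowKey);

        -- 2. Add rebuilt rows that are not already in the reporting table.
        INSERT INTO Report1 (EmployeeID, RowKey, Amount)
        SELECT t.EmployeeID, t.RowKey, t.Amount
        FROM #EmpReport AS t
        WHERE NOT EXISTS (SELECT 1
                          FROM Report1 AS r
                          WHERE r.EmployeeID = t.EmployeeID
                            AND r.RowKey = t.RowKey);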

    It's extremely rare that the same employee will have multiple reporting table rebuilds triggered simultaneously (i.e., the same rows possibly being referenced). I consider that an acceptable deadlock: it shouldn't occur and I'd prefer it not cause a deadlock, but it is what it is.

    The Problem Happening: In each reporting table, the last two steps are: irrelevant records for the employee are deleted, and records not already in the reporting table are inserted. If this rebuild is triggered for a high number of employees at once, deadlocks occur on this last delete/insert step.

    At this point in the year, things are incredibly busy for about a two-week span. I need to resolve the deadlock issue before then, which is about a week away.

    Attempted Resolution 1: Set the isolation level to READ COMMITTED SNAPSHOT. This seemed to help a bit, but mostly with read issues; two deletes/inserts could easily still cause a deadlock.
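
    For anyone following along, that's a database-level setting (the database name here is a placeholder):

        -- Needs no other open connections in the database to take effect,
        -- so run it during a quiet window.
        ALTER DATABASE [MyDatabase] SET READ_COMMITTED_SNAPSHOT ON;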

    Attempted Resolution 2: Added WITH (TABLOCK) to the DELETE/INSERT statements at the end. This prevents multiple IX locks from crossing over for the most part and seemed to cut the deadlocks down by about 80%. With this fix, the building of history really isn't an issue during normal usage.
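
    Roughly what that looks like, using the same placeholder names as the sketch above:

        -- Take the table lock up front so two rebuilds serialize
        -- instead of colliding with competing IX locks mid-statement.
        DELETE r
        FROM Report1 AS r WITH (TABLOCK)
        WHERE r.EmployeeID = @EmployeeID
          AND NOT EXISTS (SELECT 1 FROM #EmpReport AS t
                          WHERE t.RowKey = r.RowKey);

        INSERT INTO Report1 WITH (TABLOCK) (EmployeeID, RowKey, Amount)
        SELECT t.EmployeeID, t.RowKey, t.Amount
        FROM #EmpReport AS t
        WHERE NOT EXISTS (SELECT 1 FROM Report1 AS r
                          WHERE r.EmployeeID = t.EmployeeID
                            AND r.RowKey = t.RowKey);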

    Attempted Resolution 3: This is the point where I removed the CURSOR from the TRIGGER. With that change, performing these operations for multiple employees speeds up anywhere from 10% to 10,000%; the larger the bulk update, the larger the gains. This is where Caveat 1 came into play: altering the application to update in sets is not acceptable. Performance gains are still made, but we're looking at the 10% end of the spectrum, not the 10,000% end.
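
    The gist of the change, as a sketch (the trigger, table and column names are made up, not the real schema): instead of a cursor walking the inserted rows one at a time, the history insert handles the whole inserted set in one statement.

        CREATE TRIGGER trg_Employee_Update ON dbo.Employee
        AFTER UPDATE
        AS
        BEGIN
            SET NOCOUNT ON;

            -- Set-based: one history row per updated employee, no cursor loop.
            INSERT INTO dbo.EmployeeHistory (EmployeeID, ChangedOn, OldSalary, NewSalary)
            SELECT d.EmployeeID, GETDATE(), d.Salary, i.Salary
            FROM inserted AS i
            JOIN deleted  AS d ON d.EmployeeID = i.EmployeeID
            WHERE i.Salary <> d.Salary;  -- example column; the real trigger tracks more

            -- ...reporting table rebuild follows, also driven by the inserted set...
        END;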

    Attempted Resolution 4: Changed the WITH (TABLOCK) to WITH (PAGLOCK) and then WITH (ROWLOCK), hoping to lower contention on the table being modified by minimizing the scope of its lock. This didn't really work with PAGLOCK; because records are added to the end of the table, you're often still dealing with the same page. ROWLOCK just seemed to give me all sorts of stranger deadlocks. I believe this effectively allows intent exclusive (IX) locks again because the locking mechanism is no longer being stopped by the TABLOCK.

    Attempted Resolution 5: The reporting table is built for each employee in the trigger right after all of their history is modified. Rather than building it right there, I added the employee's ID to a queue table and set up a job every 10 seconds (the minimum frequency, as far as I can tell) to process the reporting tables in a set for multiple employees at once.

    The end result with 200 updates/rebuilds over 4 connections in a minute was no deadlocks, because the reporting table refresh was never running over itself. Processing as a set also reduced computing time from about 1 second per employee to about 1 second for 10 employees. This was turned down in Caveat 2 because of the possibility of having to wait up to 12 seconds for the reporting tables to reflect changes, and the chance that the application might be restricted artificially in the future.
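
    Roughly how the queue version worked (the queue table and names are placeholders):

        -- In the trigger: enqueue the employee instead of rebuilding immediately.
        INSERT INTO dbo.ReportRebuildQueue (EmployeeID, QueuedOn)
        SELECT EmployeeID, GETDATE()
        FROM inserted;

        -- In the 10-second job step: drain the queue atomically, then run the
        -- existing rebuild once for the whole batch.
        CREATE TABLE #Batch (EmployeeID int);

        DELETE q
        OUTPUT deleted.EmployeeID INTO #Batch (EmployeeID)
        FROM dbo.ReportRebuildQueue AS q;

        -- ...existing rebuild logic runs here, joined against the DISTINCT
        -- EmployeeIDs in #Batch instead of a single @EmployeeID...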

    Caveat 1: The application is newly using Entity Framework. I'm told by the development team that EF isn't designed to work in sets and that it's not best practice to override EF's SQL generation with their own. Nobody on the team has extensive experience with EF; most experience is with LINQ to SQL. I think this is the first EF project for everyone.

    The update of each employee is handled in a service that will only update one employee at a time. I'm not sure how much of this is actually an EF limitation, but it is in the design to have services process CRUD one item at a time for maximum flexibility. I suggested that the service be able to handle a set of data (1+) and that was overturned in favor of doing it one at a time.

    I'm really not familiar with EF's limitations, but not being able to handle sets sounds dubious to me. I'll investigate EF in more detail after I get this busy period sorted out.

    Caveat 2: The development team wants the future flexibility to be able to query the reporting table to look at the changes right after the update, so waiting 1-12 seconds is not acceptable.

    I agree with their worry that everyone is going to have to remember that it could take up to 12 seconds and write logic around that.

    I also think having a deadlock every 10 seconds is not acceptable either.

    Where I am now: I'm going back to TABLOCK/PAGLOCK and poring over deadlock XML. I'm not sure why TABLOCK fixed 80% of the deadlocks but not the remaining 20%. I'm trying to find out what is different.
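
    For reference, on SQL Server 2008 and later the deadlock reports can be pulled straight out of the default system_health Extended Events session, without a trace flag (the exact XML shape varies a little by version):

        -- Pull xml_deadlock_report events from system_health's ring buffer.
        SELECT xed.event_data.query('.') AS DeadlockReportXml
        FROM (
            SELECT CAST(st.target_data AS xml) AS target_data
            FROM sys.dm_xe_sessions AS s
            JOIN sys.dm_xe_session_targets AS st
              ON s.address = st.event_session_address
            WHERE s.name = 'system_health'
              AND st.target_name = 'ring_buffer'
        ) AS src
        CROSS APPLY src.target_data.nodes('RingBufferTarget/event[@name="xml_deadlock_report"]') AS xed(event_data);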

    The only way I know how to handle deadlocks is reduce the frequency and footprint of each lock.

    Fewer locks is a smaller chance of a deadlock.

    Smaller footprint means locks are more likely to miss each other.

    Set-based processing is my top solution for this, but apparently it's not an option.

    My backup would be to try to optimize the queries to reduce overall resources and time. This is a barebones delete and a barebones insert; in both cases the temp table is joined with the reporting table to figure out which records need to be deleted and which need to be added.

    What I would love is to just have it wait: queue up and wait for the table to become available. I would love to be able to insert and delete records from the table; they really don't conflict. I'm not deleting records that might be in use, and I'm not even updating records, so there is no worry about a record being updated out from underneath me.
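
    One way to get exactly that queue-and-wait behaviour would be an application lock (sp_getapplock), so concurrent rebuilds of the same reporting table line up behind each other instead of interleaving. A sketch; the resource name and timeout are made up:

        BEGIN TRAN;

        -- Second caller waits up to 30 seconds for the first rebuild to finish.
        DECLARE @rc int;
        EXEC @rc = sp_getapplock @Resource    = 'Report1Rebuild',
                                 @LockMode    = 'Exclusive',
                                 @LockOwner   = 'Transaction',
                                 @LockTimeout = 30000;
        IF @rc < 0
        BEGIN
            ROLLBACK TRAN;
            RAISERROR('Could not acquire the rebuild lock.', 16, 1);
            RETURN;
        END;

        -- ...delete/insert sync for this employee...

        COMMIT TRAN;  -- the applock is released with the transaction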

    So... any advice, I would love to hear it.

    It might be my inexperience, and I know the dev team believes they're doing their best, but it really feels like I have people telling me that database performance shouldn't come into their planning while still expecting optimized results from the database.

  • How about READ UNCOMMITTED to avoid locks...

  • CarpenterBob (6/23/2011)


    How about READ UNCOMMITTED to avoid locks...

    Caveat 6-7-8-9.

    Bad data.

    Connection being destroyed on super severe errors.

    Management makes the wrong decision based on bad data.

    Need to find another job because of the wrong suggestion.

    P.S. READ UNCOMMITTED still takes locks on a delete.

  • Ah, thanks for the enlightenment, Ninja's_RGR'us!!!

  • CarpenterBob (6/23/2011)


    Ah, thanks for the enlightenment, Ninja's_RGR'us!!!

    HTH. Very common misconception about that, errm, "feature".

    The only real use I've seen for it is when I need to test queries during a dev phase where correct data is not required and some other running test has the table locked. But that's about it.

  • I did attempt a READ UNCOMMITTED transaction around the statements, but that didn't help; it still deadlocked under heavy testing.

    Then I did some research on isolation levels. I found a guy who works for MS (I read the page at home, I don't have it handy) who outlined why you don't want to use NOLOCK or READ UNCOMMITTED, due to the data correctness and stability issues. Basically, the only time these were fine to use was in reporting (only selecting) and when exacting detail wasn't a requirement, such as for graphs and trending that tended to round the numbers anyway.

    I ended up trying to reduce the footprint of the DELETE/INSERT. It was attached to a view we materialized due to its poor performance. That's better overall for the system, but it means we've attached the view's poor performance to CRUD operations on the reporting table. In the long run we get better performance, but the view takes the hit up front, with every employee updated.
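
    For context, "materialized" here just means the standard indexed-view pattern; a sketch with placeholder names:

        -- Schema-bound view over the expensive aggregation
        -- (assumes Amount is declared NOT NULL, as SUM in an indexed view requires).
        CREATE VIEW dbo.vEmployeeTotals
        WITH SCHEMABINDING
        AS
        SELECT h.EmployeeID,
               SUM(h.Amount) AS TotalAmount,
               COUNT_BIG(*)  AS RowCnt   -- COUNT_BIG(*) is required with GROUP BY
        FROM dbo.EmployeeHistory AS h
        GROUP BY h.EmployeeID;
        GO

        -- The unique clustered index is what materializes it; SQL Server then
        -- maintains it on every write to EmployeeHistory, which is where the
        -- per-update cost comes from.
        CREATE UNIQUE CLUSTERED INDEX IX_vEmployeeTotals
            ON dbo.vEmployeeTotals (EmployeeID);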

    The CPU, memory and overall time for the queries were fine, but the I/O was shockingly bad. I started with 35,000 logical reads for a single employee and was down to 2,000 logical reads by the time I was finishing up for the night.
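
    (For anyone wanting to reproduce that measurement, the logical reads per statement show up with STATISTICS IO turned on in the query window:)

        SET STATISTICS IO ON;    -- logical/physical reads per table
        SET STATISTICS TIME ON;  -- CPU and elapsed time

        -- ...run the delete/insert for one employee here...

        SET STATISTICS IO OFF;
        SET STATISTICS TIME OFF;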

    I ran the same tests - 50 changes in a query window, 4 windows. Previously, 3 would deadlock without fail within 15 seconds, leaving the last to continue running.

    With the changes made, I didn't get a single deadlock. I pushed it up to 8 windows and still no deadlocks.

    We're doing further stress testing, trying to push the updates to their limits, but it looks like it has simmered down by reducing the queries' footprint.

    I'm really not sure how I would have handled this if I couldn't reduce the footprint of the queries being run.

  • CarpenterBob (6/23/2011)


    How about READ UNCOMMITTED to avoid locks...

    How about one of the snapshot isolation levels to avoid locks other than write locks?
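
    For example (the database name is a placeholder):

        -- Enable full snapshot isolation for the database...
        ALTER DATABASE [MyDatabase] SET ALLOW_SNAPSHOT_ISOLATION ON;

        -- ...then opt in per session or transaction:
        SET TRANSACTION ISOLATION LEVEL SNAPSHOT;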

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Thanks 🙂

    I'm running READ COMMITTED SNAPSHOT.

    The issue is only in writes. A lot of deletes and inserts on a single table.

    Things have been better since I've optimized the indexed view that sits on the table.
