Excessive Blocking despite Read Committed Snapshot Isolation

  • I've got a DynamicsAX database that's seeing a significant amount of blocking despite having RCSI set and the connections using Read Committed.

    This Microsoft article ( https://docs.microsoft.com/en-us/sql/t-sql/statements/set-transaction-isolation-level-transact-sql?view=sql-server-ver15&viewFallbackFrom=sql-server-ver12 )  confirms that there should be far less blocking, under the READ COMMITTED section it says: If READ_COMMITTED_SNAPSHOT is set to ON (the default on Azure SQL Database), the Database Engine uses row versioning to present each statement with a transactionally consistent snapshot of the data as it existed at the start of the statement. Locks are not used to protect the data from updates by other transactions.

    So why am I seeing a lot of long running blocking? Could it be that somewhere in the Dynamics process flow the app is taking out a specific lock against some objects?

    An sp_lock on the blocking SPID shows me this:

    DynamicsAK Locks

    As far as I can see the UPDSTAS lock is from auto update stats against a highly volatile but small table. I've net yet captured the specific object the blocked thread is waiting on.

    Leo

    Leo
    Nothing in life is ever so complicated that with a little work it can't be made more complicated.

  • I don't have the Azure version, but you could capture blocking events for review with Extended-Events on Azure SQL.

    It is briefly mentioned in https://docs.microsoft.com/en-us/azure/azure-sql/database/understand-resolve-blocking

  • There's nothing magical about snapshot isolation. It helps with reads and blocking. It doesn't help with writes and blocking. If more than one person is attempting to update a given page, they still have to wait for access to that page until the first person completes their update.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Leo.Miller wrote:

    I've got a DynamicsAX database that's seeing a significant amount of blocking despite having RCSI set and the connections using Read Committed.

    Aren't most of the Clustered Indexes in DynamicsAX based on Random GUIDs?  If so, let me know because I actually have a surprisingly easy fix for such problems.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • No, at least not WRT the tables where I'm seeing blocking.  Most of the clustered indexes are now PARTITIONID, DATAAREAID, TableSpecificID. Typically bigint, nvarchar, bigint.

    One issue I have is that for most clients I've worked with, for any one company, the first two columns have a cardinality of 1. So you only see any variance in the 3rd column. RECID (bigint) seems to be unique, but it's no longer being used for clustering.

    Leo
    Nothing in life is ever so complicated that with a little work it can't be made more complicated.

  • A follow on of what I have found. There are a lot of SELECT....with (UPDLOCK) commands. Since the database is already set for Read Committed Snapshot Isolation (to reduce deadlocks, etc.). I assume this is being used to prevent multiple users updating records under each other. The idea being to stop double selling of stock. The problem is, if two connections (users) use this hint in transactions, you see blocking, even if RCSI is enabled.

    At the moment it looks like the heavy use of the UPDLOCK hint with the amount of time users take to enter and process orders is causing the extensive blocking. The problem now is to persuade the DynamicsAX developers to remove the locking hint, or to do a first read, determine the stock level and if it's below some threshold resubmit the select with the locking hint. Locking an item record just in case it may get double sold when there's more than enough stock is incredibly inefficient, and they have users sitting for 20 minutes plus waiting for their queries to complete. Rather read the data twice on those few occasions when it's a risk not to lock the record.

    Are there any other good options? Anyone know DynamicsAX who could help here?

     

    Leo
    Nothing in life is ever so complicated that with a little work it can't be made more complicated.

  • Yikes.

    Nope. I'm not going to be much help here with DynamicsAX.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Also, be aware RCSI has a 14 byte per row added overhead.  If there isn't space available in the existing data pages, you can likely expect a lot of page splits from adding that 14 bytes, which will also slow down processing.

    Hold as few a rows as possible with UPDLOCK.  For example, if you have to scan thru several rows to identity a single row to hold for someone, do the scan with a simple SELECT and only do the UPDLOCK on the single row they really need to hold, even if  you have to re-read just that one row.  Again, if that's possible to do.

     

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • Leo.Miller wrote:

    A follow on of what I have found. There are a lot of SELECT....with (UPDLOCK) commands. Since the database is already set for Read Committed Snapshot Isolation (to reduce deadlocks, etc.). I assume this is being used to prevent multiple users updating records under each other. The idea being to stop double selling of stock. The problem is, if two connections (users) use this hint in transactions, you see blocking, even if RCSI is enabled.

    At the moment it looks like the heavy use of the UPDLOCK hint with the amount of time users take to enter and process orders is causing the extensive blocking. The problem now is to persuade the DynamicsAX developers to remove the locking hint, or to do a first read, determine the stock level and if it's below some threshold resubmit the select with the locking hint. Locking an item record just in case it may get double sold when there's more than enough stock is incredibly inefficient, and they have users sitting for 20 minutes plus waiting for their queries to complete. Rather read the data twice on those few occasions when it's a risk not to lock the record.

    Are there any other good options? Anyone know DynamicsAX who could help here?

    I have never used Dynamics AX but this looks like pessimistic concurrency control. It seems this is configurable in Dynamics AX so it may be worth experimenting with optimistic concurrency control in a test environment.

    https://docs.microsoft.com/en-us/dynamicsax-2012/developer/optimistic-concurrency-control

     

  • If I am reading the documentation correctly - switching to OCC in DynamicsAX changes the UPDLOCK to NOLOCK.  If that is the case, then I wouldn't make that switch - I would much rather have blocking than invalid/incorrect data.

    This assumption could be wrong - I found this: https://cloudblogs.microsoft.com/dynamics365/no-audience/2009/07/08/about-locking-and-blocking-in-dynamics-ax-and-how-to-prevent-it/

     

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • That is strange.

    I expect OCC not to have any lock hints for the initial SELECT especially with RCSI. You could try OCC in a test environment to see if any NOLOCKS are produced. If they are I would raise a support call to ask why. It may be that the ROWVERSION column is enough to stop problems with phantom reads and the table is structured in a way to stop 601 (data movement) errors etc

    Like you, I think NOLOCK here is questionable but would be interested in knowing if Microsoft think differently.

  • Thanks Ken for pointing me to this article https://docs.microsoft.com/en-us/dynamicsax-2012/developer/optimistic-concurrency-control

    Close reading indicates that while you can have Global Optimistic Concurrency Control, it is possible to set Pessimistic Control at a table level either by modifying the table property OCC definition in Dynamics or by adding options to the pseudo code related to the relevant tables. With this information I've been able to show that only a specific range of tables have PCC occurring, and these show up in the blocked query list. What was muddying the waters was the fact that the "head of the blocking chain" is a SPID, not a query, so the query currently returned in the blocking chain command may have nothing to do with the query that actually caused the blocking.

    To use a sequence of commands:

    SPID 99 > Select top 1 * from ABC with  (UPDLOCK)     -- runs in under a second so doesn't show up in long running queries and isn't blocked

    SPID 99 > Select * from XYZ where col1 = "some condition derived from the first query" -- runs long or waits for user input once completed

    SPID 110 > delete from ABC where "some condition"   -- This query will now be blocked but the head of the blocking chain actually shows as the second of 99's queries, not the first! I'm not aware that there is any way to change this to show the blocking command, not the last command executed by the blocking SPID.

    I've sent the Dynamics developers back to look at all the tables I've identified to see how the table level OCC has been configured.

    Cheers, I hope this helps someone else one day.

     

     

    Leo
    Nothing in life is ever so complicated that with a little work it can't be made more complicated.

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply