Primary key violation on INSERT, even though TABLOCKX is held

  • Hi

    I have an issue where a stored procedure runs the following code and a PK violation is reported (although I cannot reproduce this issue).

    BEGIN TRANSACTION

    SELECT TOP (0) 1 FROM table1 WITH (TABLOCKX, HOLDLOCK)

    UPDATE table1
    SET table1.column1 = l.staging_2,
    table1.column2 = l.staging_3
    FROM
    table1 t1
    JOIN
    @staging_table_var l
    ON (t1.pk_column = l.staging)

    INSERT INTO table1
    (pk_column, column1, column2)
    SELECT
    staging, staging_2, staging_3
    FROM
    @staging_table_var l
    WHERE
    NOT EXISTS (SELECT 1 FROM table1 t1 WHERE t1.pk_column = l.staging)

    COMMIT TRANSACTION

    The staging table variable does not contain duplicate  id's. This stored procedure might get called by multiple processes simultaneously, and some of those processes might contain duplicate id's *across* them, i.e. Process1 and Process2 both call this proc and the staging table for each process contains ID 1. I would expect that in this case, the TABLOCKX would prevent other processes from accessing the table and therefore Process1 might get to INSERT and Process2 gets to UPDATE once Process1 releases the TABLOCKX.

    The staging table is defined as 3 VARCHAR(4000) columns and the destination table is defined with a BIGINT primary key, a VARCHAR(32) for column1 and a TEXT column for column2. I don't think that plays into it, but mentioning it here in case it does somehow.

    What am I missing?

  • Sounds like a race condition on insert.
    I would forget about the explicit transaction and run the following:

    UPDATE t1
    SET table1.column1 = l.staging_2,
    table1.column2 = l.staging_3
    --Putting an UPDLOCK here may help avoid deadlocks
    FROM table1 t1 -- WITH (UPDLOCK)
        JOIN @staging_table_var l
            ON t1.pk_column = l.staging;

    INSERT INTO table1(pk_column, column1, column2)
    SELECT staging, staging_2, staging_3
    FROM @staging_table_var l
    WHERE NOT EXISTS
    (
        SELECT 1
        FROM table1 t1 WITH (UPDLOCK, SERIALIZABLE)
        WHERE t1.pk_column = l.staging
    );

  • Hi Ken. Thanks for the suggestion. I read a nice article about a cleaner way of doing the upsert using a pattern similar to what you described. The article: http://weblogs.sqlteam.com/dang/archive/2007/10/28/Conditional-INSERTUPDATE-Race-Condition.aspx

    What confuses me though, is that I would have thought the TABLOCKX is a more "sledgehammer" approach (less elegant) that should make it impossible to run into PK violations due to only a single SPID being able to access the table at a time. If only 1 connection can touch the table, the INSERT/UPDATE logic should be free of race conditions. I would like to understand how the race condition occurs with the TABLOCKX being held for the duration of the upsert.

  • Jon 0x7f800000 - Monday, March 6, 2017 9:47 AM

    Hi Ken. Thanks for the suggestion. I read a nice article about a cleaner way of doing the upsert using a pattern similar to what you described. The article: http://weblogs.sqlteam.com/dang/archive/2007/10/28/Conditional-INSERTUPDATE-Race-Condition.aspx

    What confuses me though, is that I would have thought the TABLOCKX is a more "sledgehammer" approach (less elegant) that should make it impossible to run into PK violations due to only a single SPID being able to access the table at a time. If only 1 connection can touch the table, the INSERT/UPDATE logic should be free of race conditions. I would like to understand how the race condition occurs with the TABLOCKX being held for the duration of the upsert.

    1) You have a potentially HUGE difference in your code and the referenced link: you have an explicit TOP(0) in your SELECT statement. If "I" were writing an optimizer I would certainly consider a shortcut for that one that would take NO locks on the table other than any required shared locks if any (schema stability maybe). I don't have time to test, but you could just run your query through the SELECT TOP(0) and see what locks are taken. 

    2) Just to make certain, you really want to avoid MERGE for this. VERY buggy, and it does suffer from the same race condition and requires a HOLDLOCK as well.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Hi Kevin. I was also paranoid about the optimizer ignoring the SELECT. I checked the locks acquired in a SQL Profiler trace and right in-between the StmtStarting and StmtCompleted of the SELECT TOP (0) [...] there is a Lock:Acquired event, with Mode "X", Type "Object" - so at least on SQL Server 2008 R2 SP1, it does seem to take out the exclusive table lock.

  • Jon 0x7f800000 - Monday, March 6, 2017 10:21 AM

    Hi Kevin. I was also paranoid about the optimizer ignoring the SELECT. I checked the locks acquired in a SQL Profiler trace and right in-between the StmtStarting and StmtCompleted of the SELECT TOP (0) [...] there is a Lock:Acquired event, with Mode "X", Type "Object" - so at least on SQL Server 2008 R2 SP1, it does seem to take out the exclusive table lock.

    Worth a shot. Thanks for replying back.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • I wouldn't resort to (or rely upon) table locking or serializable isolation level for something like this. It could inadvertently block other processes that only need read access, and at the end of the day it's probably not going to reliably do what you need it to do anyhow. Perhaps a better way to prevent two sessions of a process from running concurrently is to leverage sp_getapplock and sp_releasapplock.
    https://www.mssqltips.com/sqlservertip/3202/prevent-multiple-users-from-running-the-same-sql-server-stored-procedure-at-the-same-time/

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Jon 0x7f800000 - Monday, March 6, 2017 9:47 AM

    What confuses me though

    It confuses me as well! I find the SQL Server documentation very poor in this regard.
    My experience is that

    FROM YourTable WITH (UPDLOCK, SERIALIZABLE)

    in the exists part of the insert is reliable.
    My understanding of the documentation is that WITH (SERIALIZABLE) should be enough but the UPDLOCK is also required. I have seen some posts which suggest that the UPDLOCK is required to stop deadlocks but I am not totally convinced.

    Application Locks are also a good trick if you expect a lot more inserts than updates but you will still need to use WITH (UPDLOCK, SERIALIZABLE) if another part of your system has the potential to merge at the same time.

  • Jon 0x7f800000 - Monday, March 6, 2017 8:53 AM

    Hi

    I have an issue where a stored procedure runs the following code and a PK violation is reported (although I cannot reproduce this issue).

    BEGIN TRANSACTION

    SELECT TOP (0) 1 FROM table1 WITH (TABLOCKX, HOLDLOCK)

    UPDATE table1
    SET table1.column1 = l.staging_2,
    table1.column2 = l.staging_3
    FROM
    table1 t1
    JOIN
    @staging_table_var l
    ON (t1.pk_column = l.staging)

    INSERT INTO table1
    (pk_column, column1, column2)
    SELECT
    staging, staging_2, staging_3
    FROM
    @staging_table_var l
    WHERE
    NOT EXISTS (SELECT 1 FROM table1 t1 WHERE t1.pk_column = l.staging)

    COMMIT TRANSACTION

    The staging table variable does not contain duplicate  id's. This stored procedure might get called by multiple processes simultaneously, and some of those processes might contain duplicate id's *across* them, i.e. Process1 and Process2 both call this proc and the staging table for each process contains ID 1. I would expect that in this case, the TABLOCKX would prevent other processes from accessing the table and therefore Process1 might get to INSERT and Process2 gets to UPDATE once Process1 releases the TABLOCKX.

    The staging table is defined as 3 VARCHAR(4000) columns and the destination table is defined with a BIGINT primary key, a VARCHAR(32) for column1 and a TEXT column for column2. I don't think that plays into it, but mentioning it here in case it does somehow.

    What am I missing?

    Does table @staging_table_var have a PK on column [staging]?

    _____________
    Code for TallyGenerator

  • ...and how many rows are in the staging "table"?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Sergiy - Sunday, March 19, 2017 8:06 AM

    Does table @staging_table_var have a PK on column [staging]?

    Jeff Moden - Sunday, March 19, 2017 9:06 AM

    ...and how many rows are in the staging "table"?

    The staging table variable does not have a PK - I am awaiting some results to confirm that there are no duplicates in the staging table. Good point about a PK though - it would have made it more immediately clear if there were duplicates from the beginning.

    There are about 140 rows in the staging table, never more than 200 let's say.

  • Although off topic, in SQL 2008 the query optomiser always assumes 1 record in @staging_table_var.  This is because SQL does not maintain stats for @Table variables.

    If you change to a #Temp table, you will most likely get better performance.

  • OK, I have confirmed that there are no duplicates in the staging table at any point. So right now there is actually no reasonable explanation for the PK violation. There must be something weird on the production system. Any ideas for possible culprits? I have checked that there are no triggers on the table. Are there circumstances where TABLOCKX would allow another connection to insert a row into the table? That seems to contradict the meaning of "exclusive table lock". There is potentially code running in the production environment that also modifies this table, which I have no control over - but that's where a TABLOCKX should keep them out.

    I don't want to go down the path of trying UPDLOCK as previously suggested until I understand why TABLOCKX is not working. Performance is not an issue - this stored proc is not a bottleneck.

  • Jon 0x7f800000 - Monday, March 6, 2017 8:53 AM

    The staging table is defined as 3 VARCHAR(4000) columns and the destination table is defined with a BIGINT primary key,

    This could be a part of the problem.
    Empty space, '0' and '000' are different literal constants, but they are converted to the same integer one.
    You need to make sure you have unique values of the same data type as the destination table.
    And (just to make sure) - did you actually include the PK definition into the staging table declaration within your script?

    _____________
    Code for TallyGenerator

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply