SQLServerCentral is supported by Red Gate Software Ltd.

SERIALIZABLE Transaction Isolation Level Is Not Behaving As I Expected
Posted Friday, February 15, 2013 3:18 PM


SSC Veteran


Group: General Forum Members
Last Login: Monday, November 24, 2014 2:15 PM
Points: 258, Visits: 1,103
I have 2 goals to accomplish:
1. Continuously and atomically migrate data from a primary table to an archive table.
2. Ensure that a VIEW of a UNION ALL between the primary and the archive tables never shows duplicate rows.

Current results:
1. Code working correctly.
2. Code NOT working correctly.

This is a subset of the data migration query, executed continuously as a SQL Agent job. AUDITMSG is the primary table; OLDAUDIT is the archive table:
DECLARE @DayCount  INT = 90;
DECLARE @DayCutOff DATETIME = DATEADD(day, -@DayCount, GETDATE());
DECLARE @TempTable TABLE (ID INT NOT NULL);

BEGIN TRY
    WHILE (1 = 1)
    BEGIN
        SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

        BEGIN TRANSACTION;

        INSERT INTO @TempTable
        SELECT TOP 10 ID FROM AUDITMSG WHERE TRANS_DATE < @DayCutOff;

        INSERT INTO OLDAUDIT (ORIGINAL_ID, Vendorid)
        SELECT ID, Vendorid FROM AUDITMSG WHERE ID IN (SELECT ID FROM @TempTable);

        DELETE FROM AUDITMSG WHERE ID IN (SELECT ID FROM @TempTable);

        COMMIT TRANSACTION;

        DELETE FROM @TempTable;
        WAITFOR DELAY '00:00:01.000';
    END
END TRY
BEGIN CATCH
    DECLARE @ErrorMessage VARCHAR(255);
    SET @ErrorMessage = ERROR_MESSAGE();
    IF @@TRANCOUNT > 0      -- only roll back if a transaction is still open
        ROLLBACK TRANSACTION;
    RAISERROR(@ErrorMessage, 20, 0) WITH LOG;
    RETURN;
END CATCH

Below is a test query designed to prove that duplicate rows do not exist in either table:
SELECT COUNT(*) 
FROM AUDITMSG a
INNER JOIN OLDAUDIT o ON o.ORIGINAL_ID = a.ID

When the SQL Agent job is not executing, the count returned is always zero, which means no row exists in both tables at once.

When the SQL Agent job is executing, the count returned is 30, 40, or 50, which indicates that rows exist simultaneously in both tables, and therefore the VIEW over AUDITMSG and OLDAUDIT can show duplicates.

This is not the way I expected the SERIALIZABLE transaction isolation level to work.

From Books Online:

SERIALIZABLE - Specifies the following:

Statements cannot read data that has been modified but not yet committed by other transactions.

No other transactions can modify data that has been read by the current transaction until the current transaction completes.

Other transactions cannot insert new rows with key values that would fall in the range of keys read by any statements in the current transaction until the current transaction completes.

Range locks are placed in the range of key values that match the search conditions of each statement executed in a transaction. This blocks other transactions from updating or inserting any rows that would qualify for any of the statements executed by the current transaction. This means that if any of the statements in a transaction are executed a second time, they will read the same set of rows. The range locks are held until the transaction completes. This is the most restrictive of the isolation levels because it locks entire ranges of keys and holds the locks until the transaction completes. Because concurrency is lower, use this option only when necessary. This option has the same effect as setting HOLDLOCK on all tables in all SELECT statements in a transaction.
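That last sentence of the documentation can be illustrated with a minimal sketch, using the tables from this thread (the literal date is illustrative): the two forms below should take equivalent range locks on the keys they read.

```sql
-- Form 1: session-level SERIALIZABLE.
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRANSACTION;
    SELECT ID FROM AUDITMSG WHERE TRANS_DATE < '20121101';
COMMIT TRANSACTION;

-- Form 2: default READ COMMITTED plus a per-table HOLDLOCK hint,
-- which the documentation says has the same effect.
BEGIN TRANSACTION;
    SELECT ID FROM AUDITMSG WITH (HOLDLOCK) WHERE TRANS_DATE < '20121101';
COMMIT TRANSACTION;
```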


What have I overlooked?

What do I need to change to prevent the VIEW from showing duplicate rows?
Post #1420802
Posted Monday, February 18, 2013 9:35 AM
Valued Member


Group: General Forum Members
Last Login: Tuesday, January 21, 2014 8:29 AM
Points: 65, Visits: 66
Hi Gail,

In what isolation level is the query that checks the results running? Do you have Read Committed Snapshot on?

Also, have you considered using the DELETE...OUTPUT clause to capture the rows that should be inserted into the new table, saving you from first having to SELECT into your table variable?

http://msdn.microsoft.com/en-gb/library/ms189835.aspx#CaptureResults
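A minimal sketch of that pattern, reusing the table and column names from the original post (and assuming @DayCutOff is declared as there; note that OUTPUT...INTO has restrictions, e.g. the target cannot have enabled triggers or foreign keys):

```sql
-- Move up to 10 expired rows in one atomic statement: the DELETE from
-- AUDITMSG and the INSERT into OLDAUDIT happen in the same statement,
-- so there is no window in which a committed row exists in both tables.
DELETE TOP (10) FROM AUDITMSG
OUTPUT deleted.ID, deleted.Vendorid
INTO OLDAUDIT (ORIGINAL_ID, Vendorid)
WHERE TRANS_DATE < @DayCutOff;
```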

Jamie



Post #1421276
Posted Monday, February 18, 2013 6:39 PM


SSC Veteran


Group: General Forum Members
Last Login: Monday, November 24, 2014 2:15 PM
Points: 258, Visits: 1,103
HunterJ (2/18/2013)
Hi Gail,

In what isolation level is the query that checks the results running? Do you have Read Committed Snapshot on?

Also, have you considered using the DELETE...OUTPUT clause to capture the rows that should be inserted into the new table, saving you from first having to SELECT into your table variable?

http://msdn.microsoft.com/en-gb/library/ms189835.aspx#CaptureResults

Jamie


Thanks for responding, Jamie.

The query that checks the results is operating at the READCOMMITTED isolation level. And, in fact, the data migration query I wrote was originally operating at the same level which should be completely adequate. I escalated the isolation level to SERIALIZABLE just to see if it would fix the problem.

I have considered the DELETE...OUTPUT clause but I think what I've already written should work. I want to understand what the problem is before moving on to another way of implementing the code.
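For reference, one hedged variation worth testing is running the check query under SNAPSHOT isolation, which reads a single committed point-in-time view of both tables (this assumes ALLOW_SNAPSHOT_ISOLATION has been enabled on the database, which may not be the case here):

```sql
-- Requires: ALTER DATABASE <db> SET ALLOW_SNAPSHOT_ISOLATION ON;
-- Under SNAPSHOT, the check query sees only one committed state,
-- so an in-flight archive transaction cannot make a row appear
-- in both tables at once.
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;

SELECT COUNT(*)
FROM AUDITMSG a
INNER JOIN OLDAUDIT o ON o.ORIGINAL_ID = a.ID;
```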
Post #1421399
Posted Tuesday, February 19, 2013 1:47 AM
Valued Member


Group: General Forum Members
Last Login: Friday, December 5, 2014 11:12 AM
Points: 58, Visits: 547
Correct me if I am wrong here... it looks like you are trying to archive data.

How big is the table?
If it is significantly big, why don't you use partitioning? I see in your code that you are moving data based on a date. You could use that date column as the partitioning column and deal with the data movement once a month or quarterly (depending on your requirements and/or archival strategy).

This way you can keep isolation levels out of the picture. I am not against isolation levels, but I would avoid them in my system unless necessary.
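As a sketch of that idea, assuming a monthly partition function on the date column (all object names here are hypothetical, and partition switching requires the archive table to be aligned with the source):

```sql
-- Hypothetical monthly partitioning on TRANS_DATE.
CREATE PARTITION FUNCTION pfAuditByMonth (DATETIME)
AS RANGE RIGHT FOR VALUES ('20121201', '20130101', '20130201');

CREATE PARTITION SCHEME psAuditByMonth
AS PARTITION pfAuditByMonth ALL TO ([PRIMARY]);

-- Archiving a whole month is then a metadata-only operation:
-- ALTER TABLE AUDITMSG SWITCH PARTITION 1
--     TO AUDITMSG_ARCHIVE PARTITION 1;
```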

Thanks
Jagan K


Post #1421476
Posted Tuesday, February 19, 2013 2:32 AM
Valued Member


Group: General Forum Members
Last Login: Tuesday, January 21, 2014 8:29 AM
Points: 65, Visits: 66
Thanks for confirming that - I was hoping it might have been READ UNCOMMITTED :)

As you would expect, as soon as you insert the data into your audit table, the check query should block until the transaction is committed. Even with read committed snapshot I'd still expect to see a consistent set of results reflecting the last committed state, and no blocking. Every test I have been able to perform has behaved as expected.

It's unlikely, but could it be a timing thing, where the initial RangeS-S KEY locks are not enough? Maybe try adding a hint to force an exclusive lock, or be more aggressive with more restrictive locks? Not recommending that as a solution, though :)

Are you getting errors from the CATCH block? Table variables aren't affected by transactions, so there's an error scenario where the data being inserted/deleted might not have the RangeS-S locks in place, although I still don't really see what's going on.

It could be that your queries are using a different locking strategy, and what you are seeing is the side effect known as phantom rows, which can happen when reading from a table at anything other than the SERIALIZABLE isolation level.

It might be worth trying to capture the locks on both sessions to see what's happening.
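One way to capture that, as a sketch, is to query sys.dm_tran_locks while both sessions are running:

```sql
-- Snapshot of the locks held or requested in the current database.
-- Run this from a third session while the archive job and the
-- check query are both active, once per second or so.
SELECT request_session_id,
       resource_type,
       resource_description,   -- shows key-range info for KEY locks
       request_mode,           -- e.g. RangeS-S, RangeX-X, X
       request_status          -- GRANT, WAIT, or CONVERT
FROM sys.dm_tran_locks
WHERE resource_database_id = DB_ID()
ORDER BY request_session_id, resource_type;
```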

Sorry I can't be more definitive.

Jamie



Post #1421488
Posted Tuesday, February 19, 2013 2:38 AM
Valued Member


Group: General Forum Members
Last Login: Tuesday, January 21, 2014 8:29 AM
Points: 65, Visits: 66
jvkondapalli (2/19/2013)
Correct me if I am wrong here... it looks like you are trying to archive data.

How big is the table?
If it is significantly big, why don't you use partitioning? I see in your code that you are moving data based on a date. You could use that date column as the partitioning column and deal with the data movement once a month or quarterly (depending on your requirements and/or archival strategy).

This way you can keep isolation levels out of the picture. I am not against isolation levels, but I would avoid them in my system unless necessary.

Thanks
Jagan K


I think the question was more focused on explaining what she is seeing, rather than discussing other solutions (further emphasised by Gail's response to my suggestion of an alternative :)). Also, partitioning is only available in Enterprise edition, which may not be an option.

Jamie



Post #1421492
Posted Tuesday, February 19, 2013 2:54 AM


SSChasing Mays


Group: General Forum Members
Last Login: Friday, December 12, 2014 12:19 AM
Points: 647, Visits: 1,311
Gail,

Apart from forcing a hint, I would like you to revisit the TOP clause. I have seen TOP behave unexpectedly, particularly when you don't have an ORDER BY. It's a wild guess, but please let us know if you get a breakthrough.
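As a sketch, the deterministic version of the first step would look like this (assuming @DayCutOff is declared as in the original post; with an index on TRANS_DATE the ORDER BY need not be expensive, since it becomes an ordered range scan):

```sql
-- Deterministic TOP: always take the 10 oldest qualifying rows,
-- instead of whichever 10 the engine happens to reach first.
SELECT TOP (10) ID
FROM AUDITMSG
WHERE TRANS_DATE < @DayCutOff
ORDER BY TRANS_DATE;
```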


-- Roshan Joe
*******************************************
There are two types of DBAs: those who have skills and those who have permissions
Jeff Moden -Forum Etiquette: How to post data/code on a forum to get the best help
Custom cleanup script for backups
Post #1421499
Posted Tuesday, February 19, 2013 12:29 PM


SSC Veteran


Group: General Forum Members
Last Login: Monday, November 24, 2014 2:15 PM
Points: 258, Visits: 1,103
jvkondapalli,

I am archiving data. Currently, the AUDITMSG table contains 17,303,002 rows. The OLDAUDIT table contains 69,227,085 rows. The tables are not partitioned but use different file groups. The data archiving was previously implemented to execute several times nightly and non-atomically in increments of 100,000 rows, before I was hired. This was causing table locks and reports executing in the middle of the night were timing out. I have implemented the archiving as a "PACMAN" style job that executes continuously, around the clock, in small "nibbles" of 10 rows per second, so no one is impacted. I have fallen back to using the READCOMMITTED isolation level which should be, and should have been, adequate in the first place.

Jamie,

I asked a DBA friend of mine to try to duplicate the problem. He couldn't. In testing, I implemented HOLDLOCK hints for every step of the transaction. They had no effect at all. That's why I escalated the entire transaction to SERIALIZABLE, to try to eliminate the problem. I'm getting no errors in the CATCH block. The query executed all weekend on our production server and worked perfectly, except that I know there is a small window of time where the rows can be seen in both tables, even with READCOMMITTED isolation. I am going to change the TABLE variable to a TempDB table and see if that fixes the problem. I hadn't considered that table variables might be treated differently.

joeroshan,

I appreciate your suggestion on the TOP clause. I could use an ORDER BY clause to force an order but that would be very expensive. I've used TOP in similar queries with success. The main thing is that I need that part of the operation to read and lock 10 rows that can be escalated to exclusive locks to perform the deletions. So any 10 rows that meet that criterion will do.

Thank you all for your responses and ideas. I'm going to try some more tests with your ideas and I'll repost, even if I have no success.
Post #1421806
Posted Tuesday, February 19, 2013 2:08 PM


SSC Veteran


Group: General Forum Members
Last Login: Monday, November 24, 2014 2:15 PM
Points: 258, Visits: 1,103
I substituted a TempDB table for the TABLE variable. No change.

Post #1421847
Posted Wednesday, February 20, 2013 1:02 PM
Valued Member


Group: General Forum Members
Last Login: Tuesday, January 21, 2014 8:29 AM
Points: 65, Visits: 66
Given the rare occurrences, and neither I nor your DBA friend being able to reproduce it, I can only suggest it is a side effect of phantom rows, which can occur outside of SERIALIZABLE - both the archive job and the checker need to run at this isolation level. It is, after all, why this isolation level exists.

I think it would be fairly involved to capture enough system data and analyse it to work out exactly what's happening - I'd certainly be interested in hearing your results if you go down this path!

Personally, I'd reimplement the solution (using DELETE...OUTPUT or another of the suggested approaches - whichever is best for you) to try and minimise the time spent within the transaction, before forcing such a high isolation level.

I guess you could also try using PAGE locks rather than ROW locks, and up the number of rows you archive at a time. It might be more aggressive, but it could give you the room you need to work at a lower isolation level.
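That suggestion might look like the following sketch (hint placement only; the batch size is illustrative, and @TempTable/@DayCutOff are assumed to be declared as in the original post):

```sql
-- Take page locks up front instead of row locks, and archive a
-- larger batch per transaction. More aggressive locking, but it
-- may allow working at a lower isolation level.
INSERT INTO @TempTable
SELECT TOP (1000) ID
FROM AUDITMSG WITH (PAGLOCK, HOLDLOCK)
WHERE TRANS_DATE < @DayCutOff;
```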

Jamie



Post #1422264