When is it safe to use NOLOCK?

Question

Post reply

When is it safe to use NOLOCK?

Kristen-173977

SSCrazy Eights

Points: 8687
More actions
August 27, 2015 at 5:01 pm

#307225

We never use NOLOCK in our APPs, but I see it used like confetti in 3rd party apps installed on our servers, and it always worries me, but I wonder if my worry is unfounded and what scenarios are "safe" such that I can ignore them?
Having said we "never" use NOLOCK we do use it freely in DBA reports which query APP tables - so that we don't interfere with what users are doing when we check things. But the DBAs know in what ways the data is not "safe" and can / cannot be trusted for. Its just that we don't include NOLOCK in ANY App Code which we write for our users.
I've been looking at an example in a 3rd party APP today and my blood is boiling ... but if you learned folk tell me it is safe I'll cut vendors some slack in future.
Consider this example:
SELECT ... some columns ...
FROM WellKnownDocumentManagementSystemDocumentsTable AS DOC
WHERE ...
AND NOT EXISTS
(
SELECT*
FROMDenyDocumentToUser (NOLOCK)
WHEREDOCNUM = DOC.DOCNUM
AND DOCUSER = @CurrentUserID
)
Let's assume that I am not allowed to view that document because its DOCNUM is in the Deny Documents To User table associated with my User ID.
Normally the NOT EXISTS would return FALSE and I would never know that the document exists
But what if the DenyDocumentToUser table's index was being split as I made my query, and I only read the revised half-page index page, and the newly inserted index overflow page hadn't yet been created, so I didn't read that at all. Thus I never saw the relevant index entry at all, and as a consequence the NOT EXISTS returned TRUE. Of course if I repeated the exercise a couple of milliseconds, or more, later than my query would see all the index pages, as no [relevant] pages were being split at the time, and the document would be blocked from me again.
Is this a scenario that could, however rarely, occur?
(Its the principle I'm after, rather than this specific example, so if my example has over trivialised it such that it would never fail, but a similar scenario could fail, I'd be very grateful to hear what the subtleties are so I can moderate my blood boiling point temperature accordingly!)
Many thanks

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply

pietlinden SSC Guru Points: 63428 More actions · Answer 1

Watch Kendra Little's video[/url]...

See #6. and the video that goes with it.

She and Brent Ozar know more about SQL Server than I probably ever will. And they know what they're talking about.

Eirikur Eiriksson SSC Guru Points: 182882 More actions · Answer 2

Spent several days once debugging a system and the bug turned out tobe caused by nolock hints.

😎

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 3

Kristen-173977 (8/27/2015)
Is this a scenario that could, however rarely, occur?

In theory, yes. Very unlikely, and very dependant on what indexes exist (the dup/missing rows can happen only on scans), but in theory it's possible

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Kristen-173977 SSCrazy Eights Points: 8687 More actions · Answer 4

Thanks Gail. I have a real problem with "very unlikely" because my perception is that Business Users will make company-critical decisions on data that they see on their screen. They will assume that it is correct (unless it is wildly inaccurate, always inaccurate, or they are having a good day and, Eagle-eyed, spot a piece of goofy data)

So a once-in-a-Blue-Moon event when some rows are missing (or accidentally included twice in a financial's total) is 100% totally unacceptable to me. But I would be very interested to hear if others take a different view on that?? Thanks 🙂

Without doubt there will be an Invoice sitting in Accounts In-Tray that they are going to post in 2-minute's time, and that is not, yet, in the accounts total that I am looking at ...

... but, data which has been in the database for weeks ... months ... years even ... SUDDENLY missing from a report, and on REFRESH it is back again (maybe, or maybe something else is now missing / doubled-up) I don't think I should allow in the system at all because our users are absolutely NOT in tune with expecting that.

The users know about the not-yet-posted-invoice pile of today's post in the Accounts Department, and they know about the Goofy data - in the sense of whether that is rare, or common, in their organisation. They, for absolute certain, are not looking out for occasionally-missing-data - even if I told then they should. And if I told the Business Manager that she should beware of it she'd say "Go away and fix it" (I know that because I've had that conversation with her!!)

You'll gather that I feel pretty strongly about this one! but I do need to get a reality check on "just how bad".

The Document Management System has code like:

INSERT INTO AttributeTableUpdateQueue (DocNumber, Attribute, Value, ...) VALUE (1234, 'FileName','MyWordFile.DOC')

INSERT INTO AttributeTableUpdateQueue (DocNumber, Attribute, Value, ...) VALUE (1234, 'FileSize','12345')

There is no BEGIN TRANSACTION in site, which I'd like to fix, but I think "what's the point" because whatever is reading the queue is going to be using NOLOCK so its going to be taking a decision on whatever Attribute records it CAN see at that point-in-time and, I reckon, one day sooner-or-later is going to process a half-set of Document Attributes. If will process the remainder in the next batch-run, so at the end of the day the database integrity will be correct ... but ... that's not the way I want to work.

Kristen-173977 SSCrazy Eights Points: 8687 More actions · Answer 5

pietlinden (8/27/2015)
Watch Kendra Little's video[/url]

Very interesting, I enjoyed watching that, thanks. It reminded me that one of the other outcomes of a NOLOCK index read simultaneous with a page split is that an error is raised. Apart from the annoyance of that, to the user, I'm comfortable with that as at least the user doesn't see any dirty data.

I have tried setting the database to RCSI (notwithstanding that I know that will make no different until the NOLOCK hints are removed) but the performance tests we did were awful.

The Document Management package maintains a hierarchy tree of "folders" / "Projects" and does a lot of shuffling when a folder is inserted / removed from / moved within the tree. The code for that is something like:

UPDATEPROJECTS

SETLEFT_VISIT = LEFT_VISIT - 2

, EDITWHEN = @CurrentDateTime

WHERE TREE_ID = @SomeID

AND LEFT_VISIT > @ACounterValue

This typically updates 20,000 rows, runs in a tight RBAR loop (overnight we get these running continuous for 10 hours or so during Batch updates of the Folder/Project Tree:( ). When I tried RCSI the number of iterations fell from an already pathetic [nothing else running on the server] 14-per-minute to around 1 🙁 (14 non-clustered indexes on this table, the Clustered Index INCLUDES the LEFT_VISIT column being modified by this update ... I'm looking in all that, but its within the APP code so I'm not sure I will be able to change it - although I might well try changing the Clustered Index to something that doesn't get changed 20,000 per update, with continuous updates for 10 hours! We aren't running Full Recovery on that database, for that reason, and that bothers me too for Disaster Recovery

Kristen-173977 SSCrazy Eights Points: 8687 More actions · Answer 6

GilaMonster (8/28/2015)
Kristen-173977 (8/27/2015)
Is this a scenario that could, however rarely, occur?
(the dup/missing rows can happen only on scans)

I need to look into this further, but its a very helpful piece of info (which I didn't realise), many thanks.

The code isn't as simple as the example I have posted, and instead of the NOT EXISTS I have shown there is quite a lot of

DOC.SomeAttribute IN

(

SELECT SomeID

FROM SomeTable

WERE SomeUserCredential = 'X'

...

and generally "= 'X' " also includes an OR with a further nested sub-select, often several levels deep, so, sadly, I think there is a racing-chance that the Optimiser decides on a SCAN somewhere along the way 🙁

I will keep that in mind from now on when I'm pouring over the Query Plans 🙂