NOLOCK Hint Corrupts Results from SELECT

  • Good day fellow SQL Server Enthusiasts,

    I'm having an issue with the NOLOCK query hint causing erroneous joins. In a nutshell, I'm running a select statement on a table that joins to a couple other tables (one table has about 9M+ rows, the other two are small). I can guarantee that no other process is running on the server, and the tables are already fully committed. When I add the NOLOCK hint, the query literally joins rows that are not equal, and returns the wrong results.

    If I run the same query with the NOLOCK hint, the result set is fine.

    I know there's lots of talk around not using the NOLOCK hint, but I thought it was okay if none of the tables being used in the SELECT statement are affected by other DML statements. Indeed, I realize that if no other process is running then there's no need to consider NOLOCK in the first place, but I stumbled upon this quandary recently, and was wondering if anyone had any insight as to why this happens.

  • This article sounds like it has insights on the issue you're experiencing:

    http://www.sqlservercentral.com/blogs/stratesql/2012/06/18/the-side-effect-of-nolock/

    If my understanding of the dirty reads process is correct, even if all transactions at a given time have been committed, page splits may still occur for a short time afterwards. If you query a table using NOLOCK during that time period and a page split occurs, the data will be shuffled around throughout the table to accomodate the split, and as such, you may end up with data returned from the NOLOCK operation that corresponds to a location in memory that doesn't hold the data you originally wanted. My understanding might be a little off, but the article should be much more succint 🙂

    - 😀

  • Is there never any activity in any of the tables (other than Selects), or is the "window of no activity" short? Like a table loaded by a nightly ETL job, and you running queries on it hours after the job is done. Versus "I checked sp_who2 and there weren't any other active SPIDs while I was running my query", or something comparable to that.

    You could easily be running into an issue with data that hasn't hit a checkpoint yet, if I understand the interactions between checkpoints and NoLock correctly.

    Regardless, simply assume that if you use NoLock, or Read Uncommitted Isolation, that you are courting dirty reads, and assume they will happen regardless of what else you think may or may not be going on in the database when you're running a particular query. It's like speeding when you can't see any cops around: You have to assume you could get caught, but you decide to take the risk anyway because you think it's probably safe "right this moment". Same kind of risk as NoLock.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Langston Montgomery (11/7/2012)

    When I add the NOLOCK hint, the query literally joins rows that are not equal, and returns the wrong results.

    Something else is going on there. If you specified an equi-join, SQL will not suddenly start joining genuinely unequal rows whether NOLOCK is specified or not.

    Is it possible that the join column(s) are being modified? (?)

    Otherwise, please walk back thru the actual columns values and verify this. If you find it is continuing, please post the specific data.

    I'm assuming that you're joining on exact data types and that you're considered things like upper and lower case, collation, etc..

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • Hi Everyone,

    Thank you all for the replies. I can always count on this site for assistance.

    I'm embarrassed to say that it turns out to be a version issue. We ran the same code on a different server with 2008 R2 SP1, and it ran fine. The initial server is still on the RTM version of 2008 R2.

  • Langston Montgomery (11/11/2012)


    I'm embarrassed to say that it turns out to be a version issue. We ran the same code on a different server with 2008 R2 SP1, and it ran fine. The initial server is still on the RTM version of 2008 R2.

    but still ?? how that different version gives diferent/incorrect data ? or you want to say that you have different code/data on both servers

    -------Bhuvnesh----------
    I work only to learn Sql Server...though my company pays me for getting their stuff done;-)

  • Bhuvnesh (11/12/2012)


    Langston Montgomery (11/11/2012)


    I'm embarrassed to say that it turns out to be a version issue. We ran the same code on a different server with 2008 R2 SP1, and it ran fine. The initial server is still on the RTM version of 2008 R2.

    but still ?? how that different version gives diferent/incorrect data ? or you want to say that you have different code/data on both servers

    The code and the data is exactly the same between both servers. One server is production and the other is UAT. We refresh UAT from production every week. On the RTM version of '08 R2, when using the NOLOCK hint, it returns bad rows with values that don't match between fields of an equi-join. After removing the NOLOCK hint, it works fine.

    On the SP1 version of '08 R2, the query returns the correct results regardless of NOLOCK.

    I'm just going to chalk it up as an issue that is corrected with the right version of SQL.

  • That seems a bit strange...can you check the collation between the two servers?

  • richykong (11/12/2012)


    That seems a bit strange...can you check the collation between the two servers?

    I agree, Richy. It's very strange. Here are the collations for each server:

    Production: SQL_Latin1_General_CP1_CI_AS

    UAT: SQL_Latin1_General_CP1_CI_AS

    Database (production): SQL_Latin1_General_CP1_CI_AS

    Database (UAT): SQL_Latin1_General_CP1_CI_AS

    The servers are basically cloned as we deploy everything from UAT to production after final sign off from users. The other thing worthy of note is that the fields of the equi-join that are returning rows even though the values don't match are integers (bigint).

    I had to run this issue by several folks in I.T. We're all stunned and baffled. It works fine on SP1, though.

  • Very strange indeed.

    Try..

    dbcc opentran

    To see if you have any open transactions that you may not even be aware of.

    ----------------------------------------------------

  • Update:

    We upgraded the production server to SP1, and the issue is gone. Lunch for anyone who can figure out why it was happening 🙂

  • hisakimatama (11/8/2012)


    If my understanding of the dirty reads process is correct, even if all transactions at a given time have been committed, page splits may still occur for a short time afterwards. If you query a table using NOLOCK during that time period and a page split occurs, the data will be shuffled around throughout the table to accomodate the split, and as such, you may end up with data returned from the NOLOCK operation that corresponds to a location in memory that doesn't hold the data you originally wanted. My understanding might be a little off, but the article should be much more succint 🙂

    @gsquard , can you please put some light here

    -------Bhuvnesh----------
    I work only to learn Sql Server...though my company pays me for getting their stuff done;-)

  • Langston Montgomery (11/26/2012)


    Update:

    We upgraded the production server to SP1, and the issue is gone. Lunch for anyone who can figure out why it was happening 🙂

    Seems pretty obvious what was happening: you experienced the effects of a bug in the product that was fixed by the SQL Server patch you applied. End of story. Do I get a prize?? :hehe:

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • It may not be a bad idea to have the server and other workstations check apply updates automatically to Windows and applications.

    ----------------------------------------------------

  • mmartin1 (11/28/2012)


    It may not be a bad idea to have the server and other workstations check apply updates automatically to Windows and applications.

    Yes it would. This should administered by a systems or domain administrator.

    Jared
    CE - Microsoft

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply