WITH (NOLOCK) Question

  • Hi,

    I'm kindasorta a newbie (hence putting this in the "newbie" forum)... I've been coding Access databases for a couple decades now, and have been gradually branching into SQL Server for the last couple years, just learning from books, the Internet, and experience.

    I have a general "knowledge" question, that I was hoping somebody would help me with. 🙂

    I have a production database with a SQL Server 2008 R2 Express back-end, and a Microsoft Access front-end (with linked tables - the alternative is way outside of their price range), that is used by a small organization of about 30 "readers" and about 2 "writers." This database has the following tables (just including the relevant stuff):

    dbo.Client_Clients

    - ClientID (int)

    - (expected stuff - FirstName, LastName, etc)

    - SheetsRequiredNumMonths (tinyint)

    - ClientStartDate

    - Audit fields (CreatedDate, CreatedBy, ModifiedBy, etc.)

    dbo.Paperwork_ClientSheets

    - ClientSheetID

    - ClientID

    - MonthNum (tinyint - is just the month of the year)

    - YearNum

    - Audit fields (CreatedDate, CreatedBy, ModifiedBy, etc.)

    There are AutoAudit triggers on both of these tables. This was one of the features on which we sold them on the idea of SQL Server - the ability to track who made changes and when.

    What's going on here, is that each client is supposed to have a sheet written about them every so-many months. It varies by client. So if "SheetsRequiredNumMonths" = 1, then a sheet is required every month beginning on their "ClientStartDate." If it's 2, then a sheet's required every other month, beginning on their "ClientStartDate." If it's 6, then a sheet is required every 6 months, etc. Usually it's 1, but not always.

    Then Access needs to be able to print a report of all of the sheets that are missing (i.e. not entered).

    In order to get all the sheets that a client should have, I created a function (AFAIK, it needed to be a function, in order to be able to show the results in a pass-through query in Access, which can then be used as a basis for an Access report). The function code is as follows:

    USE [ClientFormTracking]

    GO

    /****** Object: UserDefinedFunction [dbo].[fn_GetSheetsRequired] Script Date: 02/27/2015 12:11:54 ******/

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    -- =============================================

    -- Author:<snip>

    -- Create date: 11/14/2014

    -- Description:This function returns all sheets that the client SHOULD have,

    --based on their start date and their SheetsRequiredNumMonths.

    --If the start date is null, it presumes 1/1/2014.

    -- =============================================

    CREATE FUNCTION [dbo].[fn_GetSheetsRequired]

    (

    -- Add the parameters for the function here

    )

    RETURNS

    @SheetsReq TABLE

    (

    -- Add the column definitions for the TABLE variable here

    ClientID int,

    MonthNum tinyint,

    YearNum smallint

    )

    AS

    BEGIN

    -- Fill the table variable with the rows for your result set

    DECLARE @counter int

    DECLARE @currMonthInc int

    DECLARE @defStartDt smalldatetime

    SET @defStartDt = CAST('1/1/2014' AS smalldatetime) //some start dates haven't been entered yet. Default to the first of 2014.

    SET @counter = 1

    WHILE @counter < 13 --stop at 12

    BEGIN

    SET @currMonthInc = 0

    WHILE @currMonthInc < 12

    BEGIN

    SET @currMonthInc = @currMonthInc + @counter

    INSERT INTO @SheetsReq (ClientID, MonthNum, YearNum)

    SELECT ClientID, CAST(MONTH(DATEADD("MONTH",@currMonthInc - @counter, COALESCE(ClientStartDate, @defStartDt))) AS tinyint) AS MonthNum, CAST(YEAR(DATEADD("MONTH",@currMonthInc - @counter, COALESCE(ClientStartDate, @defStartDt))) AS smallint) AS YearNum

    FROM dbo.Client_Clients

    WHERE ExitDate IS NULL AND SheetsRequiredNumMonths = @counter AND DATEADD("MONTH",@currMonthInc - @counter, COALESCE(ClientStartDate, @defStartDt)) <= GETDATE()

    END

    SET @counter = @counter + 1

    END

    RETURN

    END

    GO

    When I tried this, the users quickly started running into errors. This was months ago, so I don't recall exactly what the errors were, but I think what was happening was that if they ran the report, the database suddenly became non-updateable (at least not through the Access front-end).

    I "solved" this by adding "WITH (NOLOCK)" to the "FROM dbo.Client_Clients" line in the above function. Since then, there have been no problems. And even if there were - the slight possibility of slightly inconsistent data is far less of a problem for them, then a non-updateable database would be.

    I've read several articles, though, about the evils of "WITH (NOLOCK)" and so now I'm really curious: what would you (more experienced) people do in this situation, how, and why? 🙂

    Many thanks for the learning experience! 😀

  • I am going to assume that your function takes awhile to run?

    Lets start with what nolock does.

    By default, SQL Server uses what is known as a READ COMMITTED transaction isolation level. The intent of this is to make sure any changes happening to the database are COMMITTED before they are visible to other queries. It does a couple of things for this.

    First, when you are reading from the database, it acquires a 'shared lock' that essentially tells the database not to allow the data you are reading to be updated until you are done reading it. How many rows it locks will depend on the query.

    Also, when you are updating the database, you have a different lock on the data being updated that keeps a query that would read those rows from completing.

    Either way, those locks can prevent another query from completing.

    What nolock does is say "for this query I don't want the shared lock". That means the rows can be changed while you are querying them, and you can also query a row already in the middle of already being changed. People tend to mistakenly like this because if they have locking issues, this makes their queries go faster. But it also means the queries return unreliable information.

    So why did your database seem read only? Why did no lock on the function fix it?

    I am guessing that your query was taking some time to run (each time through the while loop), and kept on acquiring shared locks. If someone then tried to update the locked data, then would be blocked. Because they were doing it from an outside program, the query may have "timed out" before the update could complete.

    When you put in no lock, it stopped acquiring the shared lock, so it didnt make the updates wait.

    EDIT:

    It is further possible that based on the query you used, the database is table scanning your clients table every time through the loop, and quite possible having a table wide lock.

    As for what to do instead of the nolock, I would work on optimizing your function so that it is faster and gets a better query plan. If it returns fast, it won't hold the shared lock long, and updates should not time out.

    I see that Luis has already started giving you hints as to how below. You are going to want the function to be a single query that gives all your desired rows, instead of looping through. At the same time you can probably make your query more SARGABLE which could prevent the table scans, making it both faster, and less likely to lock everything.

  • You have 2 main issues in here which cause performance problems.

    1. Multi-statement table-valued function

    2. (Nested) while loops

    Those two can (and will) kill performance severely.

    Your friends here are called inline table functions[/url] and tally (or number) tables[/url]. I could post an example on how to change the code, but I just need to be sure that I'm getting the correct results.

    I used this sample data:

    CREATE TABLE dbo.Client_Clients(

    ClientID int,

    FirstName varchar(10),

    LastName varchar(10),

    SheetsRequiredNumMonths tinyint,

    StartDate datetime,

    ExitDate datetime)

    INSERT dbo.Client_Clients

    VALUES

    ( 1, 'Peter', 'Parker', 1, '20140101', NULL),

    ( 2, 'Tony', 'Stark', 3, '20140101', NULL),

    ( 3, 'Bruce', 'Banner', 6, '20140101', NULL)

    That gives me 12 rows for Peter, 4 for Tony and 2 for Bruce.

    The rows also stop at December 2014.

    Are those two things correct? If not, what should the results be?

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Nevyn: Depends on what you mean by "a while." The database is fairly small and new, and I think there are less than 200 clients. The Access report that uses this function as its base opens within a couple seconds, I believe.

    IIRC, the errors happened after the user ran the report, looked at the report, optionally printed the report, closed the report, went back to the main "Client Edit" form (which is a bound form that was open the whole time... it's the default form in the database, and is bound because the alternative is too expensive for them), and then tried to edit something.

    HTH, and thanks for the info. 🙂 What would you do instead?

  • Great post, BTW.

    For starters, there is no magic bullet. You kinda sorta found something that alleviated the symptoms, but it did not fix the problem.

    As a starting point, take a look at these reference materials:

    Indexes

    http://www.sqlservercentral.com/stairway/72399/

    I do not know what indexes may or may not be present on these tables, but I would bet that there are none.

    All of the Stairways should be on your to-do list

    Isolation Levels

    This is where the problem likely lies. By default, the isolation level is read committed.

    Start here:

    http://www.brentozar.com/isolation-levels-sql-server/

    Now for your code.

    Try to take this out of the loop. For each iteration of the loop, you hit these tables over and over.

    A set-based query will reduce the work considerably. That is a completely different forum post!

    The WHERE clause is non-sargable.

    Sargable is a made up word that stands for "search argument". This type of where clause cannot be used by any indexes, and usually results in a table scan.

    See where this is going? For each loop, you are probably reading the whole table.

    This part of the where clause is problematic.

    AND DATEADD("MONTH",@currMonthInc - @counter, COALESCE(StartDate, @defStartDt)) <= GETDATE()

    This part:

    COALESCE(StartDate, @defStartDt)

    If the field StartDate is null, then use the value of the variable, correct?

    Can you make the start date field not null, and default it to a value?

    Re-write the logic to this:

    AND StartDate >= DATEADD(MM, <number of months>, GETDATE())

    So, to summarize the next steps.

    1. Re-write the WHERE clause.

    2. Remove the loop. This will be the hardest part.

    3. Review the indexes, or lack of them.

    If you can, please post the actual Execution plan from Management Studio.

    Michael L John
    If you assassinate a DBA, would you pull a trigger?
    To properly post on a forum:
    http://www.sqlservercentral.com/articles/61537/

  • It's been a long time since I've used Access for anything, but if I remember correctly Access creates locks against whatever tables it touches, and doesn't let them go until you close the connection. That would be why you're unable to update while your function is in use...

    What NOLOCK does is tell your query to read uncommitted rows in your table. Whenever you do an update, for example, SQL actually inserts a new row, deletes the old row, and then commits the new row. The danger is that when you're reading uncommitted rows, you could potentially read the original record AND the new record if you happen to read right at that moment when the update takes place; or you could read data that has been rolled back in the case of a failed insert.

    Given that your database sounds like it is being updated manually by humans, it's probably never going to be an issue. The odds of you doing a read at exactly the moment someone else does an update are very, very small - and it doesn't sound like you care if it happens anyway. So while it might not be the "right" answer from a purist point of view, it's probably not a problem either.

    Full disclosure, I use NOLOCK for reports once in while... I am often querying against large tables that get updated often, and NOLOCK lets me avoid locking issues. I have yet to see any problem results from it. It may invoke the wrath of some forum folk, but there it is :hehe:

  • I should have posted this question on a day when I'm not sick. 😀

    First of all, thank you all for your advice thus far. I'm sure it will be really illuminating once my head clears up!

    To answer some questions and add a couple things:

    I should have mentioned in my initial post that there is a bound form open in the background while the Access report is running. The form is the main "Client Edit" form, and is bound to the Clients table. IIRC, the error occurred when the user ran the report, looked at the report, optionally printed the report, closed the report, returned to the main "Client Edit" form, and tried to edit something. (It's bound because the alternative is expensive, and their programming budget is limited. And also because they're easier to use.)

    Michael L John: Regarding Indexes: I do have some in the database, but those two tables just have PK indexes (ClientID in the Clients table, and ClientSheetID in the ClientSheets table). I've been thinking about adding a combined index for ClientID, MonthNum, and YearNum, but haven't been able to wrap my head around the advantages/disadvantages for doing so.

    Filling in the StartDate programmatically is not a good idea in this case, since that field is used by the users for a lot of things. It stores the date that the client started with the organization. The data is still in the process of being filled in, which is why Nulls are allowed - that's how they know what still needs to be entered.

    (Edit to add: How do you post an execution plan from SSMS?)

    Luis Cazares: that seems right... except I'm sure I wanted to grab everything after the StartDate up until GETDATE(). I'm going to have to take another look at the function (when I'm feeling better), since I think you may have found a logic error in the function. Thanks, and if you could post the tally table results (but test with people with different start dates), that would be great! 🙂

    Thanks again! 🙂

  • Hmm

    You should definitely rewrite the function regardless, but based on your description of the bound form, I am not sure that even with a better function you'll be able to eliminate NO LOCK. Not an expert in connecting with access so I'll let others give suggestions for that, but if having a shared lock at any point (even one that should have been released) makes your bound form read only somehow, you could be stuck with no lock.

  • Personally, the single most important thing here is getting rid of the multi-statement table valued function. Those things are nothing but trouble.

    After that, I think everyone else has offered lots of good advice.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • This is an option. It's not perfect but should be better that your current process. Someone else could improve the code. It would be nice to know what are you doing with this function as the problem can be somewhere else.

    CREATE TABLE dbo.Client_Clients(

    ClientID int,

    FirstName varchar(10),

    LastName varchar(10),

    SheetsRequiredNumMonths tinyint,

    StartDate datetime,

    ExitDate datetime

    )

    INSERT dbo.Client_Clients

    VALUES

    ( 1, 'Peter', 'Parker', 1, '20140101', NULL),

    ( 2, 'Tony', 'Stark', 3, '20140201', NULL),

    ( 3, 'Bruce', 'Banner', 6, '20140301', NULL)

    GO

    -- =============================================

    -- Author:<snip>

    -- Create date: 11/14/2014

    -- Description:This function returns all sheets that the client SHOULD have,

    --based on their start date and their SheetsRequiredNumMonths.

    --If the start date is null, it presumes 1/1/2014.

    -- =============================================

    CREATE FUNCTION [dbo].[fn_GetSheetsRequired]

    (

    -- Add the parameters for the function here

    -- @ClientID int --Should you filter instead of querying all the clients?

    )

    RETURNS TABLE

    AS RETURN

    WITH E(N) AS(

    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 --12 rows

    )

    ,cteTally(N) AS(

    SELECT 0

    UNION ALL

    SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL))

    FROM E, E x --Enough for 12 years

    )

    SELECT ClientID,

    MONTH( DATEADD(MONTH, N, ISNULL(StartDate,'20140101'))) MonthNum,

    YEAR( DATEADD(MONTH, N, ISNULL(StartDate,'20140101'))) YearNum

    FROM cteTally t

    JOIN dbo.Client_Clients c ON DATEADD( MONTH, t.N * -1, GETDATE()) >= StartDate

    OR (DATEADD( MONTH, t.N * -1, GETDATE()) >= '20140101'

    AND StartDate IS NULL)

    WHERE ExitDate IS NULL

    AND N % SheetsRequiredNumMonths = 0

    --AND ClientID = @ClientID

    GO

    --Test the function.

    SELECT *

    FROM [dbo].[fn_GetSheetsRequired]()

    ORDER BY ClientID, YearNum, MonthNum

    GO

    --Clean my database, you shouldn't use it or you'll lose data.

    --DROP FUNCTION [dbo].[fn_GetSheetsRequired]

    --DROP TABLE dbo.Client_Clients

    GO

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Katerine459 (2/27/2015)


    (Edit to add: How do you post an execution plan from SSMS?)

    See

    Grant Fritchey

    The Scary DBA

    Author of:

    SQL Server Query Performance Tuning

    and

    SQL Server Execution Plans

    These are available here:

    http://www.red-gate.com/community/books/

    Otherwise, Click the "include actual Execution plan" button on the toolbar in SSMS.

    Run the query

    You will now see three tabs, Results, Messages, and Execution Plan.

    Pick the Execution Plan tab, right click, pick 'Save execution Plan as" and attach it to a post.

    Michael L John: Regarding Indexes: I do have some in the database, but those two tables just have PK indexes (ClientID in the Clients table, and ClientSheetID in the ClientSheets table). I've been thinking about adding a combined index for ClientID, MonthNum, and YearNum, but haven't been able to wrap my head around the advantages/disadvantages for doing so.

    Are these indexes clustered?

    And why do you think adding an index on ClientID, MonthNum, and YearNum will be a good thing?

    See, we are challenging (and probably torturing) you!

    Michael L John
    If you assassinate a DBA, would you pull a trigger?
    To properly post on a forum:
    http://www.sqlservercentral.com/articles/61537/

  • Thanks again to all! All of you have given me a great deal to think about. 😀

    Luis Cazares: Thanks for that! It looks awesome, and I almost understand it, except for this:

    ,cteTally(N) AS(

    SELECT 0

    UNION ALL

    SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL))

    FROM E, E x --Enough for 12 years

    )

    ... what is that? Could you please take me through what that does? Your comment, "Enough for 12 years" implies that it returns 144 numbers (one for each month for 12 years)? And that there's a cross join or an outer join in there somewhere? Is "E, E x" shorthand for "E CROSS JOIN E"? And I'm familiar (slightly) with using SELECT Row_NUMBER() OVER(ORDER BY), but only in the context of getting the first/last/maximum/minumum row for something... what does SELECT NULL do in this context?

    Sorry about the stupid questions, but I really like to understand the ins and outs, and my head is so. Very. Foggy!

    Thanks again, (to everybody, not just Luis Cazares). 🙂

  • It's been a long time since I've used Access for anything, but if I remember correctly Access creates locks against whatever tables it touches, and doesn't let them go until you close the connection. That would be why you're unable to update while your function is in use...

    Unless something has changed for the worse, Access locks the "page" that the record is on. It was therefore possible to make Access locking row level by artificially inflating the record length. This is a long way from locking the entire table.

  • Michael L John (2/27/2015)


    Katerine459 (2/27/2015)


    (Edit to add: How do you post an execution plan from SSMS?)

    See

    Grant Fritchey

    The Scary DBA

    Author of:

    SQL Server Query Performance Tuning

    and

    SQL Server Execution Plans

    These are available here:

    http://www.red-gate.com/community/books/

    Otherwise, Click the "include actual Execution plan" button on the toolbar in SSMS.

    Run the query

    You will now see three tabs, Results, Messages, and Execution Plan.

    Pick the Execution Plan tab, right click, pick 'Save execution Plan as" and attach it to a post.

    Thanks for the general info... I'll check it out. I don't know that I'll use it for this specific thread, though (I'm thinking of using Luis Cazares's solution, once I've figured it out, but I may have to keep WITH NOLOCK, as Nevyn said, because... Microsoft Access bound forms. 😛 )

    Michael L John: Regarding Indexes: I do have some in the database, but those two tables just have PK indexes (ClientID in the Clients table, and ClientSheetID in the ClientSheets table). I've been thinking about adding a combined index for ClientID, MonthNum, and YearNum, but haven't been able to wrap my head around the advantages/disadvantages for doing so.

    Are these indexes clustered?

    And why do you think adding an index on ClientID, MonthNum, and YearNum will be a good thing?

    See, we are challenging (and probably torturing) you!

    They are PK indexes, so they are clustered.

    I'm guessing you're quizzing me on the combined index? Thanks! 🙂

    I'm not actually thinking the combined index on ClientID, MonthNum, and YearNum is necessarily a good thing - I was thinking it before, before I remembered why I did an auto-increment single PK on that table in the first place (because Microsoft Access listbox-based editing). From a logical standpoint, a combined PK may have made more sense, as it would have prevented duplicates, but when using a front-end that has a listbox with Add/Edit/Delete buttons, a single PK is so, so much simpler. And duplicates aren't really an issue - if they happen, they happen, and it doesn't screw anything up in this case, since the point of entering sheets is really to be able to see what records are absent, not what records are there. 🙂

    Although you have gotten me wondering if I shouldn't add a non-clustered index on StartDate (in the Clients table)...

  • Katerine459 (2/27/2015)


    Thanks for the general info... I'll check it out. I don't know that I'll use it for this specific thread, though (I'm thinking of using Luis Cazares's solution, once I've figured it out, but I may have to keep WITH NOLOCK, as Nevyn said, because... Microsoft Access bound forms. 😛 )

    I am not going to wade in to the main part of the pool (plenty of good advice already out there), but I will point out this tidbit: Access isn't something designed to take on hundreds of concurrent users. The optimistic locking it uses works okay for a handful of users, but turns into a performance pit after you exceed MS' guidelines (25 users). Frankly I don't think it even takes 25 before you start to see a major impact but that really depends on how busy your users really are.

    You REALLY want to look at using something that leverages looser binding of forms (ASP.NET or pretty much any of the form options implemented via .NET). Keeping things locked while you're editing will make your life miserable with the kind of concurrency you mentioned.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

Viewing 15 posts - 1 through 15 (of 22 total)

You must be logged in to reply to this topic. Login to reply