Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase ««12

With Nolock Expand / Collapse
Author
Message
Posted Thursday, October 22, 2009 10:09 AM
Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: 2 days ago @ 3:34 PM
Points: 3,109, Visits: 11,516
You should forget about using NOLOCK.

With SQL Server 2005, set the database to read_committed_snapshot, and your queries will not take locks or be blocked in most situations.

You should read about it in SQL Server 2005 Books Online first, to make sure you understand all the implications.

-- kill all connections to MyDatabase first, then run
use master
alter database [MyDatabase] set allow_snapshot_isolation on
alter database [MyDatabase] set read_committed_snapshot on



Post #807288
Posted Thursday, October 22, 2009 9:12 PM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Wednesday, August 6, 2014 5:19 PM
Points: 30, Visits: 257
One fun thing to know about ORDER BY is that w/o a ORDER BY statement in the ENTERPRISE version of SQL you can't rely on the fact that SELECT * FROM tablename will return back the record names based on the order in the clustered index. SQL 2005 EE / Dev Edition will actually look at any other queries that are running concurrently with the same query and it will hop on the read to return back the same rows that the 2nd query is running and will complete the batch with the read through to the key that started the query. So say you have spid 1 saying SELECT * FROM tableA.. and tableA has 1,000,000 rows in it ordered by Col1. Well if you started a query on a new SPID that was saying SELECT * FROM tableA while the first SPID was executing it wouldn't actualy give you ROW 1 as the first record returned. It could actually return row 500,000- 1mill first and then give you 0 - 499,999.



The I/O from an instance of the SQL Server Database Engine includes logical and physical reads. A logical read occurs every time the Database Engine requests a page from the buffer cache. If the page is not currently in the buffer cache, a physical read first copies the page from disk into the cache.

The read requests generated by an instance of the Database Engine are controlled by the relational engine and optimized by the storage engine. The relational engine determines the most effective access method (such as a table scan, an index scan, or a keyed read); the access methods and buffer manager components of the storage engine determine the general pattern of reads to perform, and optimize the reads required to implement the access method. The thread executing the batch schedules the reads.

Read-Ahead
The Database Engine supports a performance optimization mechanism called read-ahead. Read-ahead anticipates the data and index pages needed to fulfill a query execution plan and brings the pages into the buffer cache before they are actually used by the query. This allows computation and I/O to overlap, taking full advantage of both the CPU and the disk.

The read-ahead mechanism allows the Database Engine to read up to 64 contiguous pages (512KB) from one file. The read is performed as a single scatter-gather read to the appropriate number of (probably non-contiguous) buffers in the buffer cache. If any of the pages in the range are already present in the buffer cache, the corresponding page from the read will be discarded when the read completes. The range of pages may also be "trimmed" from either end if the corresponding pages are already present in the cache.

There are two kinds of read-ahead: one for data pages and one for index pages.

Reading Data Pages
Table scans used to read data pages are very efficient in the Database Engine. The index allocation map (IAM) pages in a SQL Server database list the extents used by a table or index. The storage engine can read the IAM to build a sorted list of the disk addresses that must be read. This allows the storage engine to optimize its I/Os as large sequential reads that are performed in sequence, based on their location on the disk. For more information about IAM pages, see Managing Space Used by Objects.

Reading Index Pages
The storage engine reads index pages serially in key order. For example, this illustration shows a simplified representation of a set of leaf pages that contains a set of keys and the intermediate index node mapping the leaf pages. For more information about the structure of pages in an index, see Clustered Index Structures.


The storage engine uses the information in the intermediate index page above the leaf level to schedule serial read-aheads for the pages that contain the keys. If a request is made for all the keys from ABC to DEF, the storage engine first reads the index page above the leaf page. However, it does not just read each data page in sequence from page 504 to page 556 (the last page with keys in the specified range). Instead, the storage engine scans the intermediate index page and builds a list of the leaf pages that must be read. The storage engine then schedules all the reads in key order. The storage engine also recognizes that pages 504/505 and 527/528 are contiguous and performs a single scatter read to retrieve the adjacent pages in a single operation. When there are many pages to be retrieved in a serial operation, the storage engine schedules a block of reads at a time. When a subset of these reads is completed, the storage engine schedules an equal number of new reads until all the required reads have been scheduled.

The storage engine uses prefetching to speed base table lookups from nonclustered indexes. The leaf rows of a nonclustered index contain pointers to the data rows that contain each specific key value. As the storage engine reads through the leaf pages of the nonclustered index, it also starts scheduling asynchronous reads for the data rows whose pointers have already been retrieved. This allows the storage engine to retrieve data rows from the underlying table before it has completed the scan of the nonclustered index. Prefetching is used regardless of whether the table has a clustered index. SQL Server Enterprise uses more prefetching than other editions of SQL Server, allowing more pages to be read ahead. The level of prefetching is not configurable in any edition. For more information about nonclustered indexes, see Nonclustered Index Structures.

PER BOL..

Advanced Scanning
In SQL Server Enterprise, the advanced scan feature allows multiple tasks to share full table scans. If the execution plan of a Transact-SQL statement requires a scan of the data pages in a table and the Database Engine detects that the table is already being scanned for another execution plan, the Database Engine joins the second scan to the first, at the current location of the second scan. The Database Engine reads each page one time and passes the rows from each page to both execution plans. This continues until the end of the table is reached.

At that point, the first execution plan has the complete results of a scan, but the second execution plan must still retrieve the data pages that were read before it joined the in-progress scan. The scan for the second execution plan then wraps back to the first data page of the table and scans forward to where it joined the first scan. Any number of scans can be combined like this. The Database Engine will keep looping through the data pages until it has completed all the scans. This mechanism is also called "merry-go-round scanning" and demonstrates why the order of the results returned from a SELECT statement cannot be guaranteed without an ORDER BY clause.

For example, assume that you have a table with 500,000 pages. UserA executes a Transact-SQL statement that requires a scan of the table. When that scan has processed 100,000 pages, UserB executes another Transact-SQL statement that scans the same table. The Database Engine schedules one set of read requests for pages after 100,001, and passes the rows from each page back to both scans. When the scan reaches the 200,000th page, UserC executes another Transact-SQL statement that scans the same table. Starting with page 200,001, the Database Engine passes the rows from each page it reads back to all three scans. After it reads the 500,000th row, the scan for UserA is complete, and the scans for UserB and UserC wrap back and start to read the pages starting with page 1. When the Database Engine gets to page 100,000, the scan for UserB is completed. The scan for UserC then keeps going alone until it reads page 200,000. At this point, all the scans have been completed.

Without advanced scanning, each user would have to compete for buffer space and cause disk arm contention. The same pages would then be read once for each user, instead of read one time and shared by multiple users, slowing down performance and taxing resources

ms-help://MS.SQLCC.v10/MS.SQLSVR.v10.en/s10de_0evalplan/html/fb83d250-4252-4322-b57c-3c72de8407c8.htm
Post #807575
Posted Sunday, October 25, 2009 5:30 PM
Right there with Babe

Right there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with Babe

Group: General Forum Members
Last Login: Thursday, July 31, 2014 7:04 PM
Points: 724, Visits: 1,002
-- kill all connections to MyDatabase first, then run
use master
alter database [MyDatabase] set allow_snapshot_isolation on
alter database [MyDatabase] set read_committed_snapshot on

Before changing this and hoping that you get no more blocks...if this is a vendor database check with them that this has been tested.
Post #808417
Posted Monday, October 26, 2009 1:31 AM


SSCrazy Eights

SSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy Eights

Group: General Forum Members
Last Login: Today @ 4:51 AM
Points: 9,928, Visits: 11,199
ranjitrjha (10/22/2009)
GSquared,
Your explanation makes sense but when i am running
select top 10 * from table1
select top 10 * from table1
then for both queries, result set is same.

It's only when I put a nolock hint as in queries below, the result set differs.
select top 10 * from table1
select top 10 * from table1 with (nolock)

Assuming that a clustered index exists on table1:

The first query is likely to produce a query plan which forward scans the clustered index and therefore produces results in logical clustered index order (since it scans the linked list of clustered index pages). This behaviour isn't guaranteed as such, but has always behaved this way, at least as far as I recall. The decision about how to return data to satisfy the query is made by the Storage Engine.

With NOLOCK (or READUNCOMMITTED as I much prefer it), the Storage Engine might choose to perform an IAM-ordered scan rather than following the clustered index chain. From memory, I think this is true if table1 is at least 64 pages in size. Using an IAM scan, data will be returned in the order pages were allocated. The same effect occurs if you specify TABLOCK - the Storage Engine can use the (potentially more efficient) IAM scan if it can guarantee that row movement from things like page splits won't screw things up. With READUNCOMMITTED/NOLOCK you are saying that you don't care about consistency, so the Storage Engine may choose the IAM scan. With TABLOCK, it knows that there can't be any concurrent modifications, so again an IAM scan is safe.

The Storage Engine always uses the IAM method when scanning a heap (though obviously not for any non-clustered indexes!) so that's why I qualified the above to restrict it to the case where table1 is not a heap.

Paul




Paul White
SQL Server MVP
SQLblog.com
@SQL_Kiwi
Post #808514
Posted Monday, October 26, 2009 7:26 AM


SSCarpal Tunnel

SSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal Tunnel

Group: General Forum Members
Last Login: Today @ 9:01 AM
Points: 4,439, Visits: 6,347
I think Paul is the first to get this right: it is almost certainly due to NOLOCK allowing an allocation-order scan. if you want the same rows back on a top repeatably you MUST include an order by that guarantees uniqueness of the returned rows. Otherwise SQL Server can return ANY rows - there is NO 'ordering' in set-based returns other than that given by explicit ORDER BY.

To address a comment someone else made "Using With(Nolock) can improve performance . . .": actually it doesn't technically improve performance as such (other than the tiny fraction gained from not taking/checking some locks). What it really does is improve CONCURRENCY, meaning your read isn't blocked by other DML activity.


Best,

Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru at GMail
Post #808675
Posted Monday, October 26, 2009 8:22 AM


SSCrazy Eights

SSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy Eights

Group: General Forum Members
Last Login: Today @ 4:51 AM
Points: 9,928, Visits: 11,199
I do agree with Mr Guru there about the ORDER BY thing, but as a matter of curiosity, I think the following also guarantees a repeatable order:

WITH TS AS (SELECT * FROM dbo.Test TABLESAMPLE (100 PERCENT) REPEATABLE (91256))
SELECT TOP (10) * FROM TS;

Well, so long as no changes are made to the table anyway...




Paul White
SQL Server MVP
SQLblog.com
@SQL_Kiwi
Post #808726
Posted Monday, October 26, 2009 8:53 AM


SSCarpal Tunnel

SSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal Tunnel

Group: General Forum Members
Last Login: Today @ 9:01 AM
Points: 4,439, Visits: 6,347
Paul White (10/26/2009)
I do agree with Mr Guru there about the ORDER BY thing, but as a matter of curiosity, I think the following also guarantees a repeatable order:

WITH TS AS (SELECT * FROM dbo.Test TABLESAMPLE (100 PERCENT) REPEATABLE (91256))
SELECT TOP (10) * FROM TS;

Well, so long as no changes are made to the table anyway...


I believe that even that is subject to non-repeatability since there is no explicit order by (that I could see) in the query plan. At a minimum I think the following could lead to non-repeatability:

1) anytime the engine determines it can do an allocation-order scan (such as being under READ UNCOMMITTED isolation level)
2) partitioned table
3) merry-go-round scan
4) partitioned view
5) not sure about this one, but since BOL states "Rows on individual pages of the table are not correlated with other rows on the same page." I wonder if that could affect this outcome, despite it being 100 PERCENT sample.

Actually, I just did this type of query against a table with no clustered index and the plan was a table scan with ORDERED = FALSE. I think that at least implies the potential for non-repeatability.


Best,

Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru at GMail
Post #808750
Posted Monday, October 26, 2009 5:12 PM


SSCrazy Eights

SSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy Eights

Group: General Forum Members
Last Login: Today @ 4:51 AM
Points: 9,928, Visits: 11,199
Yes I would think so too - I just haven't been able to make it behave the way we would expect yet.

It's currently sat in my interesting-but-useless pile. The only thing that interests me is how the server enforces the repeatability of the data set, and whether the mechanism used somehow also guarantees order. For example, does it keep a list of page IDs somewhere and always scan them in the same order, or does it make a temporary copy of the data?

The ordered:false in the query plan is relevant; I just wonder whether the storage engine does anything special for tablesample...

Paul




Paul White
SQL Server MVP
SQLblog.com
@SQL_Kiwi
Post #809036
Posted Tuesday, October 27, 2009 9:15 AM


Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: 2 days ago @ 6:55 AM
Points: 1,332, Visits: 19,319
GSquared (10/22/2009)
The question to ask on that one is: If you were a manager, and had an employee who was really, really fast at his job, but who lied to you, broke things, and sabotaged the work of his co-workers, would you keep him around? If the answer is yes, then with(nolock) is for you.
You say that like it's a bad thing...


---------------------------------------------------------
How best to post your question
How to post performance problems
Tally Table:What it is and how it replaces a loop

"stewsterl 80804 (10/16/2009)I guess when you stop and try to understand the solution provided you not only learn, but save yourself some headaches when you need to make any slight changes."
Post #809388
« Prev Topic | Next Topic »

Add to briefcase ««12

Permissions Expand / Collapse