• Hugo Kornelis (11/27/2008)


    Jacob Luebbers (11/27/2008)


    Hey Hugo,

    Hugo Kornelis (11/27/2008)

    ...

    * The optimizer might choose a non-clustered index to drive the query

    ...

    The technique also uses an index hint specifying the clustered index, with TABLOCKX to ensure that no other modifications could occur to the table during our UPDATE.

    Hi Jacob,

    Point taken. Though the flipside is that TABLOCK increases the chance to get an unordered scan instead of an ordered one. More on that below.

    Jacob Luebbers (11/27/2008)

    Hugo Kornelis (11/27/2008)

    ...

    * The query engine might perform a merry-go-round scan

    ...

    Jeff's article covers this (following on from comments from Gail on the merry-go-round index behaviour): http://www.sqlservercentral.com/articles/Advanced+Querying/61716/[/url].

    I see that Jeff defends his use of ORDER BY in a subquery by refering to observed behaviour, not to any documentation. Exactly the danger I am warning about.

    In fact, this behaviour of TOP 100 PERCENT ... ORDER BY did in fact change with the release of SQL Server 2005, much to the chagrin of many people who were relying on this undocumented "trick". (In all fairness, I must admit that even some of Microsoft's tools rely on this behaviour. And that apparently a very influential customer cooerced Microsoft into bringing this behaviour back in SQL 2005 and SQL 2008, although it does now require a trace flag to be set). If this doesn't prove how dangerous it is to rely on these tricks, then what does?

    Jacob Luebbers (11/27/2008)

    Hugo Kornelis (11/27/2008)

    ...

    * The optimizer might choose to use an unordered scan of the clustered index instead of the current ordered scan

    ...

    The initial insert into the table is ordered explicitly. There was some lively discussion on whether or not the results of a subsequent unordered scan would reliably come back in that order without the guarantee of an ORDER BY... so far no-one has shown a case where they don't (to my knowledge).

    I'll show you two. The first relies on how SQL Server allocates pages to tables. The first 8 pages come from mixed extents; after that all allocation is in uniform extents (i.e. a whole block of 8 pages is allocated to the table at once). I first create a filler table that allocates a mixed extent and some uniform extents. After deleting the table, the extents are free again, but they apparently keep their "mixed/uniform" mark. So than I allocate 6 tables that only use one page (all from the mixed extent), and then the actual table I will test with. The first 2 pages come from the existing mixed extent, and than a new mixed extent is allocated, after the previously allocated (and now empty) uniform extents. The rest of the table uses uniform extents, first the empty ones allocated previously (between the first and second mixed extent), then new ones (after the second mixed extent).

    use tempdb;

    go

    create database testit;

    go

    use testit;

    go

    create table filler (a char(8000));

    go

    insert into filler default values;

    go 100

    drop table filler;

    go

    create table fill1 (a int);

    create table fill2 (a int);

    create table fill3 (a int);

    create table fill4 (a int);

    create table fill5 (a int);

    create table fill6 (a int);

    insert into fill1 default values;

    insert into fill2 default values;

    insert into fill3 default values;

    insert into fill4 default values;

    insert into fill5 default values;

    insert into fill6 default values;

    go

    create table testtable

    (id int identity primary key, val int, filler char(4000));

    declare @i int;

    select @i = 1;

    while @i < 5000

    begin;

    insert into testtable (val, filler) select @i, str(@i);

    set @i = @i + 1;

    end;

    go

    select * from testtable with (nolock) option (maxdop 1);

    go

    use tempdb;

    go

    drop database testit;

    go

    The second example mimicks concurrent behaviour. While you are filling your table, someone else frees some space. Space that will now be used for your table.

    use tempdb;

    go

    create database testit;

    go

    use testit;

    go

    create table filler (a char(8000));

    go

    insert into filler default values;

    go 100

    create table testtable

    (id int identity primary key, val int, filler char(4000));

    declare @i int;

    select @i = 1;

    while @i < 5000

    begin;

    insert into testtable (val, filler) select @i, str(@i);

    set @i = @i + 1;

    -- SIMULATE CONCURRENT ACTIVITY

    if @i = 2000 truncate table filler;

    end;

    go

    select * from testtable with (nolock) option (maxdop 1);

    go

    use tempdb;

    go

    drop database testit;

    go

    (Sorry for the all-lowercase by the way, but I just lost my post because I took too long and I can't be bothered to do the nice formatting again)

    Jacob Luebbers (11/27/2008)

    Hugo Kornelis (11/27/2008)

    ...

    My point is that every undocumented behaviour that you observe, no matter how consistent it appears, should be considered to be a coincidal side effect of how the current version of the software interacts with your current hardware configuration and current data distribution. And that is NOT a safe foundation for building production software.

    ...

    Agree with you here - however I'm willing to take that risk in certain cases. The massive performance gain in certain cases using this technique is compelling, and if the process using it can bear this risk (and the maintainers of that process are mindful of potential changes in this behaviour with updates to the engine) I say go for it.

    Regards,

    Jacob

    This is where we will have to agree to disagree. I've witnessed this too often. When you warn about the danger, management is always "prepared to take the risk, and they will "of course monitor, and plan followup action". But when the shit hits the fan, it's suddenly your fault...

    Maybe you have been lucky not to have experienced this yet 😉 But I won't take any chances 😀

    Yep, the code you wrote works about the way you say it will, but I figured out the problem in your code in the second block. First, I wouldn't write a query using the NOLOCK hint unless absolutely necessary. Take your code in the second code block, and change your last query to use the query hint (index = 1) IN PLACE OF the (nolock) you have. Better yet, copy the query and run both (one with nolock and one with index = 1). You will see that the one with (index = 1) returns an ordered set.

    Edit: And just for s and g's, I ran it also with no hint on the table with the clustered primary key, still returned an ordered result set.

    Another Edit: Added a query using both hints (nolock, index = 1), returned an ordered result set.