OFFSET and FETCH NEXT, inconsistent execution plan

  • Hi All,

I've been doing some testing with some of the new features of SQL 2012, and I've found some strange inconsistencies in the way SQL Server generates its execution plans. In certain situations it isn't using indexes when it could, and I can't figure out why.

    I've got the following simple table...

    [ID] [uniqueidentifier] (primary key, non-clustered)

    [Title] [nvarchar](150)

    [DateCreated] [datetime] (indexed, non-clustered)

    I've added 100,000 test rows.

I would have thought the following two queries should be equivalent:

-- Query 1
------
select * from [Table_2]
ORDER BY datecreated asc
OFFSET 50000 ROWS
FETCH NEXT 40000 ROWS ONLY;
------

-- Query 2
------
declare @StartRow bigint
declare @NumRows bigint

set @StartRow = 50000
set @NumRows = 40000

select * from [Table_2]
ORDER BY datecreated asc
OFFSET @StartRow ROWS
FETCH NEXT @NumRows ROWS ONLY;
------

The odd thing is that the first query runs much slower. According to the execution plan, its relative cost is 96% compared to 4%.

The plan shows that the first query uses a table scan plus a sort, while the second query uses an index scan.

I'm a bit confused as to why SQL Server isn't using the index on datecreated for the first query, but is for the second.

The cost of the first query is actually the same as if I order by the non-indexed column "Title".

The other odd thing is that even if I force SQL Server to use the index with...

WITH (INDEX(IX_Table_2))

...this doesn't help; it actually increases the cost slightly, even though the execution plan then looks the same for both queries.

Mind you, having the variables makes things a lot more flexible, but I would really like to know why the other version of the query is less efficient when, logically, I would think it should be the other way around if anything.

    Any thoughts?

    Eugene

If you can post the execution plans I'll be guessing less, but my guess is that it's probably a parameter sniffing issue. Take a look at the estimated values in each of the plans.
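The quickest way to see them is to capture the actual plans, which carry both the estimated and the actual row counts per operator. A minimal sketch, run against your own table:

set statistics xml on;
go

-- each statement now returns its actual plan alongside its results;
-- compare EstimateRows with the actual row count on each operator
select * from [Table_2]
ORDER BY datecreated asc
OFFSET 50000 ROWS
FETCH NEXT 40000 ROWS ONLY;
go

set statistics xml off;
go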

    And yeah, I'm happy to see nice consistent behavior on query hints. They're frequently more painful than helpful.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

Agreed. I reproduced the test and noticed that the second query (the one that used an index scan) had a weird estimate for the output. Image attached.

Actual number of rows = 70000 (a scan, so presumably it reads the first 70000 rows to get rows 50001 to 70000 as required)

    Estimated I/O cost = 0.238681

    Estimated Operator cost / Subtree cost = 0.0036486

Estimated number of rows = 100 (!)

    So estimated operator cost = 100/70000 * 0.238681 = 0.0036484 (approx)

    Don't know where the 100 estimated number of rows comes from.

    Fitz

A disparity between estimated and actual rows points towards either bad statistics on the data, or good statistics on skewed data, which causes bad parameter sniffing for some values.

For posting the execution plan, screen captures don't really work well. You can export the execution plan as a *.sqlplan file and post that here; that way all the available information is included. Also, I'd post both plans, the good one and the bad one, so that comparisons can be made.
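If you'd rather pull the plans straight from the plan cache than save them out of Management Studio, something along these lines works (a rough sketch using the standard plan-cache DMVs; the LIKE filter is just for illustration):

select st.text, qp.query_plan
from sys.dm_exec_cached_plans as cp
cross apply sys.dm_exec_sql_text(cp.plan_handle) as st
cross apply sys.dm_exec_query_plan(cp.plan_handle) as qp
where st.text like '%Table_2%';  -- narrow to the statements under test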

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

*.sqlplans attached. (The SQL below was used to create the sample data; I know it isn't optimal, but I didn't want to spend too much brain power on it. I got bored waiting at 90,000 rows.)

    Statistics have been updated with full scan.

create database TestOffset;
go

use TestOffset;
go

create table TblOffset(
    ID uniqueidentifier primary key nonclustered,
    Title varchar(150),
    DateCreated datetime
);
go

create index DCIndex on TblOffset(DateCreated);
go

-- load 100,000 sample rows (one insert per iteration; slow but simple)
declare @rows int = 1, @maxrows int = 100000;
while @rows <= @maxrows
begin
    insert into TblOffset
    select newid(), 'AAAA', dateadd(dd, @rows % 730, '2010-01-01');
    set @rows = @rows + 1;
end;

select * from TblOffset;
go

update statistics TblOffset with fullscan;
go

-- capture the actual plans for the test queries
-- (note: set showplan_xml on would stop the statements from executing,
-- so set statistics xml is used here instead)
set statistics xml on;
go

-- table scan, sort, top = 7.35012
select *
from TblOffset
order by DateCreated
offset 50000 rows fetch next 20000 rows only;
go

-- table scan, sort, top = 7.35012
select *
from TblOffset with (index (DCIndex))
order by DateCreated
offset 50000 rows fetch next 20000 rows only;
go

-- index scan/RID lookup, nested loop, top = 0.305035
declare @offset bigint = 50000;
declare @nextrows bigint = 20000;
select *
from TblOffset
order by DateCreated
offset @offset rows fetch next @nextrows rows only;
go

    Fitz

  • Mark Fitzgerald-331224 (5/29/2012)


*.sqlplans attached. [repro script snipped; quoted in full in the previous post]

    Yeah, it still feels like a parameter sniffing issue. Precise values versus estimates.

    Again, not at all surprised that the index hint doesn't work. They seldom do.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

Try adding OPTION (RECOMPILE) to the second query, or use sp_executesql with parameters to do the comparison.

SQL Server uses a different approach to estimate cardinality when you use parameters or constants than when you use variables. The values of variables are not known at compile time.
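For example, a quick sketch against Fitz's TblOffset table from above:

-- variables plus a statement-level recompile: the optimizer can see
-- the variable values at the time the statement is compiled
declare @offset bigint = 50000;
declare @nextrows bigint = 20000;

select *
from TblOffset
order by DateCreated
offset @offset rows fetch next @nextrows rows only
option (recompile);
go

-- or parameterize through sp_executesql, so the values arrive as
-- parameters that can be sniffed at compile time
exec sp_executesql
    N'select * from TblOffset
      order by DateCreated
      offset @offset rows fetch next @nextrows rows only;',
    N'@offset bigint, @nextrows bigint',
    @offset = 50000, @nextrows = 20000;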

    --

    AMB

  • hunchback (8/21/2013)


Try adding OPTION (RECOMPILE) to the second query, or use sp_executesql with parameters to do the comparison.

SQL Server uses a different approach to estimate cardinality when you use parameters or constants than when you use variables. The values of variables are not known at compile time.

    --

    AMB

    Just be cautious there because a statement recompile can use the value of a variable for sniffing purposes.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Eugene-928407 (5/25/2012)


The odd thing is that the first query runs much slower. According to the execution plan, its relative cost is 96% compared to 4%.

It is not surprising that the performance is different. What is surprising is that the first query is the slow one. To wit, in the first query SQL Server has full information; in the second, the optimizer has no clue what those variables hold and makes a blanket guess (the estimate of 100 rows that Fitz saw). And this time the guess gives the better result.

    Grant keeps talking about parameter sniffing, but there are not really any parameters to sniff here.

Alejandro (welcome, by the way!) suggested using OPTION (RECOMPILE), and this should make the second query slow as well, since the variables will now be handled as constants.
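If recompiling on every execution is too high a price, OPTIMIZE FOR is another way to get a similar effect for one representative pair of values; a sketch, assuming 50000/20000 are typical:

declare @offset bigint = 50000;
declare @nextrows bigint = 20000;

-- the optimizer compiles the plan as if the variables held the
-- hinted values, without a recompile on every execution
select *
from TblOffset
order by DateCreated
offset @offset rows fetch next @nextrows rows only
option (optimize for (@offset = 50000, @nextrows = 20000));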

Erland Sommarskog, SQL Server MVP, www.sommarskog.se

  • Thanks, Erland!

    I decided to give it a try after seeing your post in the private group about your move.

    --

    AMB
