Updating Covering Indexes

  • Hi,

    During some performance tuning and optimization I performed at a client, I discovered what appears to be strange behavior and I just wanted to ask the community for any comments.

    The behavior is related to the IO's generated when updating a table with covering indexes.

    As a test, I setup the following scenario:

    Create a simple table with three columns

    1. A primary key (clustered).

    2. A column to index on.

    3. A column to include.

    I create three of these tables

    1. A table with only a clustered IX on column 1

    2. A table with a clustered IX on column 1 and a non-clustered IX on column 2

    3. A table with a clustered IX on column 1 and a non-clustered IX on column 2 including column 3

    Finally I run a simple update statement while outputting STATISTICS IO. The update statement updates column 3.

    I am expecting to see the following:

    Update on table 1

    A clustered index scan followed by a clustered index update.

    Update on table 2

    A non-clustered index scan followed by a clustered index update. There are fewer pages in the non-clustered index and the update should not affect the non-clustered index in any way. I am expecting to see fewer reads for this operation.

    Update on table 3

    A non-clustered index scan followed by a clustered index update and a non-clustered update. There are fewer pages in the non-clustered index and the update will also affect the non-clustered index. I am expecting to see somewhere in the region of double the reads for this operation.

    I actually saw the following:

    Update on table 1

    A Clustered index scan - 22 Pages

    Scan count 1, logical reads 24

    elapsed time = 29 ms.

    Update on table 2

    A non-clustered index scan - 18 pages

    Scan count 1, logical reads 10341

    elapsed time = 117 ms.

    Unexpectedly, logical reads have increased massively

    Update on table 3

    A non-clustered index scan - 20 pages

    Scan count 1, logical reads 30343

    elapsed time = 5201 ms.

    Now logical reads have increased three times over the second update.

    Here is my test code:

    -- Make sure tables don't exist

    IF EXISTS(SELECT * FROM sys.tables WHERE name = 'ONLY_PK')

    BEGIN

    DROP TABLE ONLY_PK;

    END;

    IF EXISTS(SELECT * FROM sys.tables WHERE name = 'NO_INCLUDE')

    BEGIN

    DROP TABLE NO_INCLUDE;

    END;

    IF EXISTS(SELECT * FROM sys.tables WHERE name = 'ONE_INCLUDE')

    BEGIN

    DROP TABLE ONE_INCLUDE;

    END;

    -- Create Test Tables

    -- One table to contain only a PK

    CREATE TABLE ONLY_PK

    (PK_COL INT NOT NULL PRIMARY KEY,

    IX_COL INT NOT NULL,

    INC_COL INT NOT NULL);

    GO

    -- One table to contain a PK and an NC index with no included column

    CREATE TABLE NO_INCLUDE

    (PK_COL INT NOT NULL PRIMARY KEY,

    IX_COL INT NOT NULL,

    INC_COL INT NOT NULL);

    GO

    CREATE NONCLUSTERED INDEX IX_NO_INCLUDE ON NO_INCLUDE (IX_COL);

    GO

    -- One table to contain a PK and an NC index containing an included column

    CREATE TABLE ONE_INCLUDE

    (PK_COL INT NOT NULL PRIMARY KEY,

    IX_COL INT NOT NULL,

    INC_COL INT NOT NULL);

    GO

    CREATE NONCLUSTERED INDEX IX_ONE_INCLUDE ON ONE_INCLUDE (IX_COL) INCLUDE (INC_COL);

    GO

    -- Load dummy data into our three test tables

    WITH CTE AS (

    SELECT TOP 5000

    ROW_NUMBER() OVER (ORDER BY a.object_id) AS RowNo

    FROM

    sys.all_columns a

    CROSS JOIN

    sys.all_columns b)

    INSERT INTO ONLY_PK

    SELECT

    RowNo,

    RowNo,

    RowNo

    FROM

    CTE;

    INSERT INTO NO_INCLUDE

    SELECT * FROM ONLY_PK;

    INSERT INTO ONE_INCLUDE

    SELECT * FROM ONLY_PK;

    GO

    -- Have a look at the index page count

    SELECT * FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('ONLY_PK'), NULL, NULL, 'DETAILED');

    -- 22 pages for clustered index

    SELECT * FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('NO_INCLUDE'), NULL, NULL, 'DETAILED');

    -- 18 pages for nonclustered index

    SELECT * FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('ONE_INCLUDE'), NULL, NULL, 'DETAILED');

    -- 20 pages for nonclustered index

    -- Switch on stats for the update test

    SET STATISTICS IO ON;

    SET STATISTICS TIME ON;

    SET STATISTICS XML ON;

    UPDATE ONLY_PK

    SET INC_COL = 0;

    -- Clustered index scan - 22 Pages

    -- Scan count 1, logical reads 24

    -- elapsed time = 29 ms.

    UPDATE NO_INCLUDE

    SET INC_COL = 0;

    -- Non clustered index scan - 18 pages

    -- Scan count 1, logical reads 10341

    -- elapsed time = 117 ms.

    UPDATE ONE_INCLUDE

    SET INC_COL = 0;

    -- Non clustered index scan - 20 pages

    -- Scan count 1, logical reads 30343

    -- elapsed time = 5201 ms.

    SET STATISTICS IO OFF;

    SET STATISTICS TIME OFF;

    -- This code will reset the tables.

    UPDATE ONLY_PK

    SET INC_COL = PK_COL;

    UPDATE NO_INCLUDE

    SET INC_COL = PK_COL;

    UPDATE ONE_INCLUDE

    SET INC_COL = PK_COL;

    The SQL Guy @ blogspot[/url]

    @SeanPearceSQL

    About Me[/url]

  • Hi Sean

    Just a hunch, I rebuilt all of the indexes then updated stats after loading the tables with data - and the results are quite different. Updating the table with the covering index takes 2x the time as it takes to update the table with the noncovering index, instead of 50x. Given this change from a nonsense to a sense result, personally I'd lose interest in the difference between 20,000 and 30,000 logical reads.

    Cheers

    ChrisM

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Hi Chris,

    I'm more curious about the logical reads rather than the execution time. I tried my test script again, but rebuilding all indexes before the update, but saw no noticeable difference in logical reads.

    Thanks

    The SQL Guy @ blogspot[/url]

    @SeanPearceSQL

    About Me[/url]

  • Sean Pearce (5/17/2013)


    Hi Chris,

    I'm more curious about the logical reads rather than the execution time. I tried my test script again, but rebuilding all indexes before the update, but saw no noticeable difference in logical reads.

    Thanks

    That's what I saw too. I'm as curious as you are now, Sean - but also aware that not all "logical reads" are created equal - far from it.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • The one difference there that's pretty likely to be the culprit are the Compute Scalar operators in that last query. That's going to have an effect. It seems a little high in numbers of pages, but it makes sense that you're seeing additional stuff moving through memory in order to support additional operations.

    Also, I'd say fragmentation plays a small part as well. You've got 13 pages on all three indexes, yes, but you'll note (at least I saw this on my system) that there were multiples of fragments going up linearly in almost the same correlation as the page reads went up. Since we're talking very small pages here, it's likely that you're seeing some of the artifacts of mixed extents leading to additional reads as well.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Grant Fritchey (5/17/2013)


    The one difference there that's pretty likely to be the culprit are the Compute Scalar operators in that last query. That's going to have an effect.

    I realize this will have an effect but I don't see them accounting for the logical read numbers.

    Grant Fritchey (5/17/2013)


    It seems a little high in numbers of pages.

    Exactly.

    Grant Fritchey (5/17/2013)


    Since we're talking very small pages here, it's likely that you're seeing some of the artifacts of mixed extents leading to additional reads as well.

    I stumbled across this behavior when trying to tune an update on a larger table. The update in question went from 1,500,000 logical reads to 100,000 just by removing the INCLUDED columns on my indexes. I then setup the test above to try and understand it.

    The SQL Guy @ blogspot[/url]

    @SeanPearceSQL

    About Me[/url]

  • Scenario A - Clustered Index. This should read one page and write one page.

    Scenario B - Clustered Index & non-participating non-clustered Index. This should read one page and write one page.

    Scenario C - Clustered Index & participating non-clustered Index. This should read two pages and write two pages.

    The SQL Guy @ blogspot[/url]

    @SeanPearceSQL

    About Me[/url]

  • I just tried to replicate my findings from yesterday and found some differences. Then I realized I was on a different server with different MAXDOP settings. I reran my test on a server with MAXDOP 0 and got the same figures from yesterday. I will rerun my tests with different MAXDOP settings.

    The SQL Guy @ blogspot[/url]

    @SeanPearceSQL

    About Me[/url]

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply