Indexes with Include

  • Grant Fritchey (6/29/2009)


    You're right about the DROPCLEANBUFFERS and FREEPROCCACHE.

    Personally I prefer to run all test queries twice and ignore the times of the first result. That way, what I'm seeing in tests does not include compile time or time to read off disk.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (6/29/2009)


    Grant Fritchey (6/29/2009)


    You're right about the DROPCLEANBUFFERS and FREEPROCCACHE.

    Personally I prefer to run all test queries twice and ignore the times of the first result. That way, what I'm seeing in tests does not include compile time or time to read off disk.

    I do both. It really depends on what I'm trying to find out. Having been burned by compile times in the past, if the query is even marginally complex, I like to do a lot of testing with a clean cache.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • 1) I would imagine (hope) that the optimizer would choose the 'most-compact' covering index in the case where multiple indexes cover a query.

    2) DTA and the missing index subsystem really needs to be taken with a grain of salt and thoroughly examined before throwing it's recommendations into production. I have seen it do INCREDIBLY awful things to databases!!!

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • TheSQLGuru (6/30/2009)


    1) I would imagine (hope) that the optimizer would choose the 'most-compact' covering index in the case where multiple indexes cover a query.

    2) DTA and the missing index subsystem really needs to be taken with a grain of salt and thoroughly examined before throwing it's recommendations into production. I have seen it do INCREDIBLY awful things to databases!!!

    Boy do I agree with this. I've got negative faith in the DTA.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • >> negative faith

    New term - I think I may like it! Although perhaps antifaith could be better. 🙂

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Grant Fritchey (6/28/2009)


    Paul White (6/28/2009)


    Grant Fritchey (6/28/2009)


    Every time I ran the query, it picked the widest set of includes, regardless of the order in which the indexes were created. I didn't check reads or timings.

    I ran some tests on 2005 and 2008 with AdventureWorks and about the only constant was that if an exactly-covering index existed, it was chosen. (These were additional indexes I created - not the shipped ones).

    Other than that, even with twenty indexes to choose from, with varying numbers of INCLUDEd columns, in different orders, no clear pattern emerged. I thought for a moment that indexes with the required column *last* in the creation order were preferred; but no.

    Fascinating question, but I ran out of time with it.

    Paul

    Huh. Odd. I did the same thing. I had about five different indexes, all with the same key, but varying include columns, but duplicated, like the OP. Every single time, it chose the index with the most columns. Couldn't tell you why. I just didn't dig much past that yet.

    I think it might make sense if we consider the fact that the index that covers the maximum columns is in fact the most reusable.

    This is even more true with test data which are sometimes just a few rows only, since sql server anyway deals with pages and as such must anyway bring in entire page.

    To test this farther we might want to use the following:

    1) Have tables with lots of data

    2) Test with indexes that radically differ in the number of included columns

    3) Create a situation that is contradicting to caching, say by executing a query and after the results are already cached add a more covering index and then force recompilation of the query without clearing the cache (say by using option recompile) and see if it will use the cached index or the more covering one.

    While I have not done so far any tests yet, the argument that I presented is assuming that the cost of using the different indexes are the same, however of this is not the case then we need to farther investigate.

    ( The cost can be determined by forcing the use of the indexes using index hints and then look at the execution plan to determine the cost)

  • I'm pretty sure that it depends on the way the index is accessed.

    If it is accessed by an index seek (nested loop), then it shouldn't matter how many extra columns are included as they won't increase the cost of an individual seek. (Ok, if you had enough included columns to increase the depth of the index, then *maybe* it would prefer a narrower index, I would be suprised. Gail -- care to test this?)

    If the index is accessed in a scan, then it should select the narrowest covering index (less IO). This is easy to see when you do " select clus_key from t " and it chooses to scan a non-clustered index.

  • TheSQLGuru (6/30/2009)


    >> negative faith

    New term - I think I may like it! Although perhaps antifaith could be better. 🙂

    Or simply DTA = ANTItuning

  • Dear All,

    I have a question with regards to that please.

    Ex: Index Idx1 on col1 with include on col2 and col3

    The question is if i run and update statment as follows

    Update Tbl1 set col2=anyvalue

    So here i am updating the included column in the index will that update involve an update to the index as well?

    Thanks

    Nader

  • nadersam (3/27/2013)


    Dear All,

    I have a question with regards to that please.

    Ex: Index Idx1 on col1 with include on col2 and col3

    The question is if i run and update statment as follows

    Update Tbl1 set col2=anyvalue

    So here i am updating the included column in the index will that update involve an update to the index as well?

    Thanks

    Nader

    Yes, but unlike when you update a key column, only the leaf level of the index needs to updated.

    If you run this code:

    CREATE TABLE Test

    (

    col1 INT NOT NULL,

    col2 CHAR(1),

    col3 TINYINT

    );

    GO

    CREATE NONCLUSTERED INDEX IX_test ON Test(col1) INCLUDE(col2, col3);

    GO

    INSERT INTO Test

    SELECT DISTINCT

    AC.object_id,

    LEFT(OBJECT_NAME(AC.object_id), 1),

    ABS(AC.object_id % 9)

    FROM

    master.sys.all_columns AS AC;

    GO

    UPDATE

    Test

    SET

    col2 = 'c'

    WHERE

    col2 IS NULL;

    GO

    DROP TABLE TEST;

    and show the actual execution plan, while you don't see it in the graphical execution plan if you look at the plan XML you'll see something like this:

    <Update DMLRequestSort="false">

    <Object Database="[tempdb]" Schema="[dbo]" Table="[Test]" IndexKind="Heap" />

    <Object Database="[tempdb]" Schema="[dbo]" Table="[Test]" Index="[IX_test]" IndexKind="NonClustered" />

    So you can see both table and index are updated. A tool like SQLSentry's Plan Explorer will show you details in a more readable format.

  • Dear Jack,

    Thanks fro your reply, what i understand from your reply is that it has an impact but not that serious, is that right?., specially for tables with large number of records and update statements updating multiple records.

    Jack Corbett (3/27/2013)


    nadersam (3/27/2013)


    Dear All,

    I have a question with regards to that please.

    Ex: Index Idx1 on col1 with include on col2 and col3

    The question is if i run and update statment as follows

    Update Tbl1 set col2=anyvalue

    So here i am updating the included column in the index will that update involve an update to the index as well?

    Thanks

    Nader

    Yes, but unlike when you update a key column, only the leaf level of the index needs to updated.

    If you run this code:

    CREATE TABLE Test

    (

    col1 INT NOT NULL,

    col2 CHAR(1),

    col3 TINYINT

    );

    GO

    CREATE NONCLUSTERED INDEX IX_test ON Test(col1) INCLUDE(col2, col3);

    GO

    INSERT INTO Test

    SELECT DISTINCT

    AC.object_id,

    LEFT(OBJECT_NAME(AC.object_id), 1),

    ABS(AC.object_id % 9)

    FROM

    master.sys.all_columns AS AC;

    GO

    UPDATE

    Test

    SET

    col2 = 'c'

    WHERE

    col2 IS NULL;

    GO

    DROP TABLE TEST;

    and show the actual execution plan, while you don't see it in the graphical execution plan if you look at the plan XML you'll see something like this:

    <Update DMLRequestSort="false">

    <Object Database="[tempdb]" Schema="[dbo]" Table="[Test]" IndexKind="Heap" />

    <Object Database="[tempdb]" Schema="[dbo]" Table="[Test]" Index="[IX_test]" IndexKind="NonClustered" />

    So you can see both table and index are updated. A tool like SQLSentry's Plan Explorer will show you details in a more readable format.

  • nadersam (3/27/2013)


    Dear Jack,

    Thanks fro your reply, what i understand from your reply is that it has an impact but not that serious, is that right?., specially for tables with large number of records and update statements updating multiple records.

    Yes it has an impact, but you have to determine if that impact is significant for your workload. Will there be a lot of updates to the included column? Are there enough queries that return the included column to negate the maintenance impact of the updates?

    If you run this script:

    CREATE TABLE Test

    (

    col1 INT NOT NULL,

    col2 CHAR(1),

    col3 TINYINT

    );

    GO

    INSERT INTO Test

    SELECT DISTINCT

    AC.object_id,

    LEFT(OBJECT_NAME(AC.object_id), 1),

    ABS(AC.object_id % 9)

    FROM

    master.sys.all_columns AS AC;

    GO

    /* See what happens in IO which is usually the most expensive part of the plan */

    SET STATISTICS IO ON;

    GO

    RAISERROR('Point Update without an index', 10, 1) WITH nowait;

    UPDATE

    Test

    SET

    col2 = 'z'

    WHERE

    col1 = -1061705188;

    GO

    RAISERROR('Range Update without an index', 10, 1) WITH nowait;

    UPDATE

    Test

    SET

    col2 = 'c'

    WHERE

    col3 = 7;

    GO

    SET STATISTICS IO OFF;

    GO

    CREATE NONCLUSTERED INDEX IX_test ON Test(col1) INCLUDE(col2, col3);

    GO

    SET STATISTICS IO ON;

    GO

    RAISERROR('Point Update with an index with the column included ', 10, 1) WITH nowait;

    UPDATE

    Test

    SET

    col2 = 'y'

    WHERE

    col1 = -1061705188;

    GO

    RAISERROR('Range Update with an index with the column included ', 10, 1) WITH nowait;

    UPDATE

    Test

    SET

    col2 = 'd'

    WHERE

    col3 = 7;

    GO

    SET STATISTICS IO OFF;

    GO

    DROP INDEX dbo.TEst.IX_test;

    GO

    CREATE NONCLUSTERED INDEX IX_test ON Test(col1, col2, col3);

    GO

    SET STATISTICS IO ON;

    GO

    RAISERROR('Point Update with an index with the column as part of the key', 10, 1) WITH nowait;

    UPDATE

    Test

    SET

    col2 = 'x'

    WHERE

    col1 = -1061705188;

    GO

    RAISERROR('Range Update with an index with the column as part of the key ', 10, 1) WITH nowait;

    UPDATE

    Test

    SET

    col2 = 'e'

    WHERE

    col3 = 7;

    GO

    SET STATISTICS IO OFF;

    GO

    DROP TABLE TEST;

    You can see the impact it has when doing a seek on the index key and when doing a range not using the index key.

    In this example I've added clustered primary key:

    CREATE TABLE Test

    (

    pkcol INT IDENTITY(1, 1)

    PRIMARY KEY,

    col1 INT NOT NULL,

    col2 CHAR(1),

    col3 TINYINT

    );

    GO

    INSERT INTO Test

    SELECT DISTINCT

    AC.object_id,

    LEFT(OBJECT_NAME(AC.object_id), 1),

    ABS(AC.object_id % 9)

    FROM

    master.sys.all_columns AS AC;

    GO

    /* See what happens in IO which is usually the most expensive part of the plan */

    SET STATISTICS IO ON;

    GO

    RAISERROR('Point Update using Clustered PK without an index', 10, 1) WITH nowait;

    UPDATE

    Test

    SET

    col2 = '1'

    WHERE

    col1 = 1;

    GO

    RAISERROR('Point Update without an index', 10, 1) WITH nowait;

    UPDATE

    Test

    SET

    col2 = 'z'

    WHERE

    col1 = -1061705188;

    GO

    RAISERROR('Range Update without an index', 10, 1) WITH nowait;

    UPDATE

    Test

    SET

    col2 = 'c'

    WHERE

    col3 = 7;

    GO

    SET STATISTICS IO OFF;

    GO

    CREATE NONCLUSTERED INDEX IX_test ON Test(col1) INCLUDE(col2, col3);

    GO

    SET STATISTICS IO ON;

    GO

    RAISERROR('Point Update using Clustered PK with an index with the column included', 10, 1) WITH nowait;

    UPDATE

    Test

    SET

    col2 = '2'

    WHERE

    col1 = 1;

    GO

    RAISERROR('Point Update with an index with the column included ', 10, 1) WITH nowait;

    UPDATE

    Test

    SET

    col2 = 'y'

    WHERE

    col1 = -1061705188;

    GO

    RAISERROR('Range Update with an index with the column included ', 10, 1) WITH nowait;

    UPDATE

    Test

    SET

    col2 = 'd'

    WHERE

    col3 = 7;

    GO

    SET STATISTICS IO OFF;

    GO

    DROP INDEX dbo.TEst.IX_test;

    GO

    CREATE NONCLUSTERED INDEX IX_test ON Test(col1, col2, col3);

    GO

    SET STATISTICS IO ON;

    GO

    RAISERROR('Point Update using Clustered PK with an index with the column as part of the key', 10, 1) WITH nowait;

    UPDATE

    Test

    SET

    col2 = '3'

    WHERE

    col1 = 1;

    GO

    RAISERROR('Point Update with an index with the column as part of the key', 10, 1) WITH nowait;

    UPDATE

    Test

    SET

    col2 = 'x'

    WHERE

    col1 = -1061705188;

    GO

    RAISERROR('Range Update with an index with the column as part of the key ', 10, 1) WITH nowait;

    UPDATE

    Test

    SET

    col2 = 'e'

    WHERE

    col3 = 7;

    GO

    SET STATISTICS IO OFF;

    GO

    DROP TABLE TEST;

  • Dear Jack,

    Thanks again for your valuable information, i will try it and let you know.

    Regards

    Nader

Viewing 13 posts - 16 through 27 (of 27 total)

You must be logged in to reply to this topic. Login to reply