Unused index causing longer select performance problems?

  • Just watched this video by Pinal Dave, he is saying that an unused index can cause poor performance in a select.

    I have tried his code but it doesn't give slower performance on my machine.

    Can someone explain this phenomenon or test it to see if it's slower on their machine?

  • Here's a script to test it:

    SET STATISTICS IO, TIME ON
    GO
    SELECT SalesOrderDetailId, OrderQty
    FROM Sales.SalesOrderDetail sod
    WHERE ProductId = (SELECT AVG(ProductId)
    FROM Sales.SalesOrderDetail sod1
    WHERE sod1.SalesOrderId = sod.SalesOrderId
    GROUP BY SalesOrderId)

    GO
    print 'No Indexes xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
    go
    CREATE INDEX IX_SalesOrderDetail_1 ON Sales.SalesOrderDetail(SalesOrderId ASC, ProductId ASC) INCLUDE (SalesOrderDetailId, OrderQty)
    GO
    SELECT SalesOrderDetailId, OrderQty
    FROM Sales.SalesOrderDetail sod
    WHERE ProductId = (SELECT AVG(ProductId)
    FROM Sales.SalesOrderDetail sod1
    WHERE sod1.SalesOrderId = sod.SalesOrderId
    GROUP BY SalesOrderId)
    GO
    print 'Index 1 present xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
    GO
    CREATE INDEX IX_SalesOrderDetail_2 ON Sales.SalesOrderDetail(ProductId ASC, SalesOrderId ASC) INCLUDE (SalesOrderDetailId)
    GO
    SELECT SalesOrderDetailId, OrderQty
    FROM Sales.SalesOrderDetail sod
    WHERE ProductId = (SELECT AVG(ProductId)
    FROM Sales.SalesOrderDetail sod1
    WHERE sod1.SalesOrderId = sod.SalesOrderId
    GROUP BY SalesOrderId)
    GO
    print 'Index 1 and 2 present xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
    GO
    DROP INDEX IX_SalesOrderDetail_1 ON Sales.SalesOrderDetail
    GO
    DROP INDEX IX_SalesOrderDetail_2 ON Sales.SalesOrderDetail
    GO
  • Maybe he's just saying that if you have an index that is never used, that could be a symptom of performance improvements to be made, by creating one that is used by the typical queries against that table?

    -------------------------------------------------------------------------------------------------------------------------------------
    Please follow Best Practices For Posting On Forums to receive quicker and higher quality responses

  • jonathan.crawford wrote:

    Maybe he's just saying that if you have an index that is never used, that could be a symptom of performance improvements to be made, by creating one that is used by the typical queries against that table?

    In the video he runs the query on a table with no index and gets:

    Table 'Worktable'. Scan Count 3, logical reads 36495
    Table 'SalesOrderDetail'. Scan Count 1, logical reads 1246

    Then he creates and index and reruns the query to get:

    Table 'SalesOrderDetail'. Scan Count 2, logical reads 612

    Then creates a second index and runs the query to get:

    Table 'Worktable'. Scan Count 3, logical reads 368495
    Table 'SalesOrderDetail'. Scan Count 1, logical reads 306

    Then drops the second index and the query returns to the same scans as it was before the second index was added.

    Table 'SalesOrderDetail'. Scan Count 2, logical reads 612

    Then he recreates the second index and adds a hint to the query so it only uses the first index. The query returns to the high scans.

    Table 'Worktable'. Scan Count 3, logical reads 368495
    Table 'SalesOrderDetail'. Scan Count 1, logical reads 306

    At this point the query is not using the second index. But then he drops the second index and the query returns to a low number of scans.

    He then drops the seconds index and it return to a low number of reads:

    Table 'SalesOrderDetail'. Scan Count 2, logical reads 612

    I tried it but the performance did not go down when I added the second index. I was using the AdventureWorks2012 database but it looks like he is using the 2014 version, so not sure if this cold be the difference?

  • I'm somewhat aware of this, but I don't have the precise answer. It's some kind of bug or glitch in the optimizer that gets a difference in row counts. I suspect, but again, don't know for certain, it only works with certain versions of SQL Server and you won't see it in all of them.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Grant Fritchey wrote:

    I'm somewhat aware of this, but I don't have the precise answer. It's some kind of bug or glitch in the optimizer that gets a difference in row counts. I suspect, but again, don't know for certain, it only works with certain versions of SQL Server and you won't see it in all of them.

    Thanks Grant, someone has written a comment on the video that it's to do with SQL Server using the statistics from the index even though it doesn't use the index in the query. Pinal seems to agree with this.

    I just could not replicate the problem on SQL 2012.

  • It's likely that it's a version thing then, and certainly, likely to be 2014+ since that's when the cardinality estimation engine changed. Funny enough, Pinal's advice is the opposite of mine (with TONS of caveats). I tell people to NOT just drop indexes that they think are unused because of stuff like this, but in the opposite direction. You can see better performance and plans because of those statistics. Testing, testing, testing is your only friend in this situation.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Grant Fritchey wrote:

    It's likely that it's a version thing then, and certainly, likely to be 2014+ since that's when the cardinality estimation engine changed. Funny enough, Pinal's advice is the opposite of mine (with TONS of caveats). I tell people to NOT just drop indexes that they think are unused because of stuff like this, but in the opposite direction. You can see better performance and plans because of those statistics. Testing, testing, testing is your only friend in this situation.

    It's things like that that killed us when we upgraded from 2012 to 2016.  It affected a lot of the queries... way to many too many to fix regression test.  Thankfully, they had the trace flag to kill the "improved" cardinality estimator.

    I also find myself thinking that, although it made for a very consistent demonstration, if someone wrote a query of that nature in my shop, some high velocity pork chops would come into play. 😀

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • The underlying issue appears to be how merge join decides it wants columns ordered with a composite join predicate. https://dba.stackexchange.com/a/259853/3690

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply