Not using Index

  • HI all, I am doing a simple select query, returns about 4000 records from 1570321 records.

    select * from sales where (sales_id = 1123013 AND sales_in_no >= 199701 AND sales_in_no <= 199701)

    The query takes about 1min to run.

    Index sales_idx is created for sales_id and sales_in_no. But I looked at the sql profiler, doesnt look like sql server is using this index. Is there any reason why the index is not been used??

    Thanks for your suggestions.

  • Just outta curiosity, what is this for?

    salesin_no >= 199701 AND sales_in_no <= 199701

    It looks like that would select everything in that column.

    Depending on the order of the columns in your index the optimizer may find it more efficient to just scan the entire table.

    What results are you getting from the execution plan when you run it?

    -Luke.

    To help us help you read this[/url]For better help with performance problems please read this[/url]

  • - Is sales_id the first column of your index ?

    - is the other column the first one of the index ?

    - from your total column distribution, which percentage is of sales_id = 1123013 ?

    - from your total column distribution, which percentage is of the other column matches your search value ?

    - is your index well organized ?

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • Sorry, that was a typo.

    select * from sales where (sales_id = 1123013 AND sales_in_no >= 199901 AND sales_in_no <= 199701)

    This is the output I get from the execution plan:

    Physical Operation: Clustered Index scan

    Logical Operation: Clustered Index scan

    Object

    [test].[dbo].[sales].[sales.pk]

    Hall of Fame,

    Yes, sales_id is the first column of my index.

    About 70% of the total column distribution is of sales_id = 1123013.

    About 60% of the total column distribution matches the search value of the other column.

    This is the index:

    USE [test]

    GO

    CREATE NONCLUSTERED INDEX [SALES_IDX] ON [dbo].[SALES]

    (

    [SALES_ID] ASC,

    [SALES_IN_NO] ASC

    )WITH (SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF) ON [PRIMARY]

    Will it help if I use clustered index? Thanks.

  • since so much of the table would be included in your index it's possible that the optimiser wants to do the scan because it thinks it's more efficient than do all of the bookmark lookups for the rest of the columns. I'd be interested if it would use the index if you didn't use Select *

    but actually used

    SELECT sales_id, sales_in_no

    FROM sales

    where (sales_id = 1123013 AND sales_in_no >= 199901 AND sales_in_no <= 199701)

    To help us help you read this[/url]For better help with performance problems please read this[/url]

  • I tried that, no luck. It didnt use the index when I selected just 2 columns..

  • Check whats the cost involved(from execution plan) when you specify the required index by using index hint. Turn on statistics (set statistics io on) and check logical and physical reads.

    Do the same with normal query execution(using clustered index) and compare the figures. Make sure you empty the cache before executing the query for the second time otherwise records will be thre in cache in advance.

    Is statistics updated for the indexes involved?

    MJ

  • What's the fragmentation on that index and the clustered index like?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • - Can you rebuild your indexes (clustered and non clustered) ?

    - If this behavior is still the same afterward, you can force it to use the index, but then I strongly advise to document why you used the hint.

    (Because you'll have to re-evaluate its need with every sqlserver hotfix/sp/upgrade you apply)

    - Don't use hints unless you really have to and unless you tested they really pay off !

    /* dd 20080921 used index hint because the index was not used and the gain is considerable */

    Select ...

    from [dbo].[SALES] with (index(SALES_IDX))

    where sales_id = 1123013

    AND sales_in_no >= 199701

    AND sales_in_no <= 199701

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • psangeetha (9/19/2008)


    Sorry, that was a typo.

    select * from sales where (sales_id = 1123013 AND sales_in_no >= 199901 AND sales_in_no <= 199701)

    This is the output I get from the execution plan:

    Physical Operation: Clustered Index scan

    Logical Operation: Clustered Index scan

    Object

    [test].[dbo].[sales].[sales.pk]

    Hall of Fame,

    Yes, sales_id is the first column of my index.

    About 70% of the total column distribution is of sales_id = 1123013.

    About 60% of the total column distribution matches the search value of the other column.

    This is the index:

    USE [test]

    GO

    CREATE NONCLUSTERED INDEX [SALES_IDX] ON [dbo].[SALES]

    (

    [SALES_ID] ASC,

    [SALES_IN_NO] ASC

    )WITH (SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF) ON [PRIMARY]

    Will it help if I use clustered index? Thanks.

    Heh... it's still a typo... 2nd value is greater than the first. Since nothing is found, it will take longer.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • psangeetha (9/19/2008)


    Sorry, that was a typo.

    select * from sales where (sales_id = 1123013 AND sales_in_no >= 199901 AND sales_in_no <= 199701)

    This is the output I get from the execution plan:

    Physical Operation: Clustered Index scan

    Logical Operation: Clustered Index scan

    Object

    [test].[dbo].[sales].[sales.pk]

    Hall of Fame,

    Yes, sales_id is the first column of my index.

    About 70% of the total column distribution is of sales_id = 1123013.

    About 60% of the total column distribution matches the search value of the other column.

    This is the index:

    USE [test]

    GO

    CREATE NONCLUSTERED INDEX [SALES_IDX] ON [dbo].[SALES]

    (

    [SALES_ID] ASC,

    [SALES_IN_NO] ASC

    )WITH (SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF) ON [PRIMARY]

    Will it help if I use clustered index? Thanks.

    It's not the query that takes time, but fetching all data. 60% is a lot, sqlserver will use index if the resultset is a small part of table, 15% or less, depending on table size and other things.

    Clustered index scan is the same as table scan and is more cost effective in your case than index seek.

    Also, try to use between, as query can be better optimized and typos like this:

    select * from sales where (sales_id = 1123013 AND sales_in_no between 199901 AND 199701)

    become more obvious.

  • Robert (9/22/2008)


    It's not the query that takes time, but fetching all data. 60% is a lot, sqlserver will use index if the resultset is a small part of table, 15% or less, depending on table size and other things.

    That's if bookmark lookups are required, and often the tipping point is closer to 1%. If the index is covering then SQL will normally use the NC index, even if 100% of the table is required.

    I did a test of that a while back here. Will find the link.

    Edit: http://www.sqlservercentral.com/Forums/FindPost564821.aspx

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (9/22/2008)


    That's if bookmark lookups are required, and often the tipping point is closer to 1%. If the index is covering then SQL will normally use the NC index, even if 100% of the table is required.

    You're right. But queries with covering index are relatively rare. Tipping point depends on lot of things if query becomes a complex join. In real world it can be over 10% even with huge resultsets or it can happen to choose full table scan even if query is on primary key (which I consider a bug in the engine).

  • Can't you use Between instead of = ???

    🙂

  • Thanks for all your comments. I've changed the code to use between and the execution time has reduced. THe application team is currently happy with it now. If they come back again, I am planning on executing the query by forcing the index.

    Thanks again.

Viewing 15 posts - 1 through 15 (of 24 total)

You must be logged in to reply to this topic. Login to reply