Selectivity Ratio - using Index

  • I have a table with no primary key and a non-clustered composite index of two columns. When I look at the number of distinct values (selectivity ratio), the table has a ratio of 52% unique values. I thought an index should be at least 90% unique in order for it to be used. Yet when I run the query with the index, it only takes 2 seconds to run. When I drop the index, it takes 1.25 minutes to run. Why is it using the index when the index is only 52% unique?

    One of the examples always given is to never have an index on a column that is a flag because it only has 50% distinct values (YES, NO). So, what am I misunderstanding here.

    Thank you!

  • Is the index covering? The 90% (often closer to 99%) applies when SQL needs to do bookmark lookups to the cluster/heap

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • I've seen the optimizer choose to do index scans rather than table scans, even when the difference shouldn't matter. In this case, it sounds like it's a better index than you're making out. Are you sure of the selectivity?

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Dear friends,

    Anybody can tell me deeply, about normalization (All types) with simple examples?

    Vishnu S

    mail : Vishnu.swamynathan@yahoo.co.in

  • vishnu.swamynathan (3/3/2008)


    Dear friends,

    Anybody can tell me deeply, about normalization (All types) with simple examples?

    Vishnu S

    mail : Vishnu.swamynathan@yahoo.co.in

    Since this is a completely different topic from the one under discussion, you should start a new conversation thread instead of sidetracking this one.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • The table has 232,441 rows with 124,376 distinct values.

  • Sorry, please explain what you mean by index 'covering'. Thank you.

  • A covering index is one in which all the columns needed by the query are contained within the index.

    When you listed the row count and distinct value count, was that for all columns within the table? I mean, the table as 200k rows but only half of them are distinct rows? Multiple columns of data and it's all duplicated?

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • WHERE XP.Container_id = @Container_id AND XP.Slot_id = @Slot_id AND XP.IsActive=1

    The index contains container_id and slot_id. So it does not cover the IsActive column?

    The distinct values are for container_id and Slot_id, not the entire table.

    SELECT @total_unique = (select count(*) from (SELECT DISTINCT slot_id, container_id from xref_containerslotDataPrimerPair) as t1)

  • What does the execution plan look like?

    I suspect, but this is a guess, you're getting an index scan and then a bookmark lookup to get to the data and then another filter of some sort to eliminate the records that don't match that final value. While the selectivity of the index isn't very high, I can see how the optimizer would use that instead of the entire table for this operation. Still, the execution plan will tell more.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • That brings up a whole other issue. Even though when I run it with the index it takes 2 seconds versus 1.25 minutes without the index, the execution plan is identical! I do not understand this! So the time is different yet I don't see any difference in the execution plan. The table that has the added index is called within a function that is called within the select statement of a Stored Procedure. So I wonder if this is the problem.

  • The function is probably presenting itself as (doing this from memory, so forgive me if I'm slightly off) a table scan or an external operation, but I think if you tested the query within the function, you'd see two different execution plans there.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply