• Actually not sure about the 80% value but there is a threshold in which statistically the index will not be a value. But in your example you mention M, F, null and a military database. No matter the percentage of F in the system the statistics are still based on the overall uniqueness of the data in the index. So if you have said index and it has a million rows then the statistical uniqueness is .0003 percent. Due to this, more often than not it will choose a table scan or clustered index scan no matter how many times you run it. That said there are some conditional situations where things will behave differently than the statement leads to believe.

    Ex. -- Try building an querying against this table.

    CREATE TABLE dbo.MilPer (idx int identity(1,1) not null primary key,

    sex char(1) null

    )

    go

    CREATE NONCLUSTERED INDEX IX_MilPer ON dbo.MilPer

     (

     sex

    &nbsp

    GO

    set nocount on

    --Insert 1,000,000 records all M

    declare @x int

    set @x = 0

    while @x < 1000000

    begin

     insert dbo.MilPer (sex) values ('M')

     set @x = @x + 1

    end

    go

    --50000 updated to F

    UPDATE

     M

    SET

     sex = 'F'

    FROM

     dbo.MilPer M

    INNER JOIN

     (SELECT TOP 50000 X.idx FROM dbo.MilPer X ORDER BY NewId()) Y

    ON

     M.idx = Y.idx

    go

    --20000 updated to null

    UPDATE

     M

    SET

     sex = Null

    FROM

     dbo.MilPer M

    INNER JOIN

     (SELECT TOP 20000 X.idx FROM dbo.MilPer X WHERE sex != 'F' ORDER BY NewId()) Y

    ON

     M.idx = Y.idx

    go

    --Verify counts

    select sex, count(*) from dbo.MilPer  group by sex

    GO

    -----Do your testing here.--------

    --Now change the table to this.

    ALTER TABLE dbo.MilPer ADD

     [alt] char(1) NOT NULL CONSTRAINT DF_MilPer_at DEFAULT 'A'

    GO

    -----Do your testing here.--------

    You'll find that when sex and idx were the only columns it will use the non-clustered index. This is because it also happens that the non-clustered index will work as a covering index for SELECT * in that case because the idx column is tucked inside the non-clustered index because of the way clustered indexes relate to non-clustered.

    However once you add the alt column you will find it will do a clustered index seek. This is because the non-clustered index no longer will work as a covering index and the statistic density (based on uniqueness) isn't going to pose high enough to have the system consdier it. If you instead do SELECT idx, sex you will get the non-clustered seek again. That said however neither method executes faster than the other because the data itself in the case isn't dense enough for thorough testing.