Actually not sure about the 80% value but there is a threshold in which statistically the index will not be a value. But in your example you mention M, F, null and a military database. No matter the percentage of F in the system the statistics are still based on the overall uniqueness of the data in the index. So if you have said index and it has a million rows then the statistical uniqueness is .0003 percent. Due to this, more often than not it will choose a table scan or clustered index scan no matter how many times you run it. That said there are some conditional situations where things will behave differently than the statement leads to believe.
Ex. -- Try building an querying against this table.
CREATE TABLE dbo.MilPer (idx int identity(1,1) not null primary key,
sex char(1) null
)
go
CREATE NONCLUSTERED INDEX IX_MilPer ON dbo.MilPer
(
sex
 
GO
set nocount on
--Insert 1,000,000 records all M
declare @x int
set @x = 0
while @x < 1000000
begin
insert dbo.MilPer (sex) values ('M')
set @x = @x + 1
end
go
--50000 updated to F
UPDATE
M
SET
sex = 'F'
FROM
dbo.MilPer M
INNER JOIN
(SELECT TOP 50000 X.idx FROM dbo.MilPer X ORDER BY NewId()) Y
ON
M.idx = Y.idx
go
--20000 updated to null
UPDATE
M
SET
sex = Null
FROM
dbo.MilPer M
INNER JOIN
(SELECT TOP 20000 X.idx FROM dbo.MilPer X WHERE sex != 'F' ORDER BY NewId()) Y
ON
M.idx = Y.idx
go
--Verify counts
select sex, count(*) from dbo.MilPer group by sex
GO
-----Do your testing here.--------
--Now change the table to this.
ALTER TABLE dbo.MilPer ADD
[alt] char(1) NOT NULL CONSTRAINT DF_MilPer_at DEFAULT 'A'
GO
-----Do your testing here.--------
You'll find that when sex and idx were the only columns it will use the non-clustered index. This is because it also happens that the non-clustered index will work as a covering index for SELECT * in that case because the idx column is tucked inside the non-clustered index because of the way clustered indexes relate to non-clustered.
However once you add the alt column you will find it will do a clustered index seek. This is because the non-clustered index no longer will work as a covering index and the statistic density (based on uniqueness) isn't going to pose high enough to have the system consdier it. If you instead do SELECT idx, sex you will get the non-clustered seek again. That said however neither method executes faster than the other because the data itself in the case isn't dense enough for thorough testing.