• While troubleshooting a poor performing query this morning I came across something which surprised me - actually, two things:

    1) one of the tables being queried defined no primary key or clustered index

    2) the optimiser chose an index seek over an index scan for the query

    For argument's sake, let's put aside the first shocking point for now. In fact, look past all the poor design decisions. It's the optimiser's behaviour in the given scenario which interests me.

    The heap being queried comprises ~9.5M records in a single partition occupying ~36GB of space or ~5M pages (it's a wide denormalised table). A non-clustered index exists on a char(1) column called 'TransType' with density of 0.25:

    RANGE_HI_KEY EQ_ROWS

    9,581,682

    A 264

    H 263

    X 98

    It also defines 4 INCLUDED fields: PolNo, PolSeq, PolYear, PolMonth and occupies 3 levels over 41,482 pages.

    Finally, here's the portion of the query which surprised me:

    SELECT

    PolNo,

    PolSeq,

    PolYear,

    PolMonth

    FROM

    dbo.PolHistory

    WHERE

    TransType = ' '; --41,575 logical reads

    The resulting plan consists of a single Index Seek operator. The index stats utilised a FULLSCAN and are up-to-date.

    Sure, the index is covering, but given the value being seeked has such poor selectivity, why isn't a scan of the leaf nodes chosen over 9.5M traversals the index structure?