While troubleshooting a poorly performing query this morning I came across something that surprised me - actually, two things:
1) one of the tables being queried defined no primary key or clustered index
2) the optimiser chose an index seek over an index scan for the query
For argument's sake, let's put the first shocking point aside for now. In fact, look past all the poor design decisions; it's the optimiser's behaviour in this scenario that interests me.
The heap being queried comprises ~9.5M records in a single partition, occupying ~36GB of space, or ~5M pages (it's a wide, denormalised table). A non-clustered index exists on a char(1) column called 'TransType', with a density of 0.25.
It also defines four INCLUDED columns (PolNo, PolSeq, PolYear, PolMonth) and occupies 3 levels across 41,482 pages.
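Based on that description, the index would look something like this (the table and index names are my own assumptions; only the column names come from the scenario above):

```sql
-- Hypothetical reconstruction of the index described above.
-- Table name dbo.PolicyTrans and index name are assumptions.
CREATE NONCLUSTERED INDEX IX_PolicyTrans_TransType
ON dbo.PolicyTrans (TransType)
INCLUDE (PolNo, PolSeq, PolYear, PolMonth);
```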
Finally, here's the portion of the query which surprised me:
TransType = ' '; --41,575 logical reads
The resulting plan consists of a single Index Seek operator. The index stats utilised a FULLSCAN and are up-to-date.
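For context, a query shape that would produce this plan and let you reproduce the logical-read count might look like the following (the table name is an assumption; the SELECT list is restricted to the index's key and included columns, which is what makes the index covering):

```sql
-- SET STATISTICS IO reports logical reads per table in the Messages tab.
SET STATISTICS IO ON;

SELECT PolNo, PolSeq, PolYear, PolMonth
FROM dbo.PolicyTrans      -- hypothetical table name
WHERE TransType = ' ';    -- the predicate quoted above
```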
Sure, the index is covering, but given that the value being sought has such poor selectivity, why isn't a scan of the leaf nodes chosen over 9.5M traversals of the index structure?
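To put the numbers in perspective: a density of 0.25 implies roughly four distinct TransType values, so an equality predicate is expected to match about a quarter of the table, and the observed read count is almost exactly the size of the index's leaf level. A quick back-of-the-envelope check using only the figures quoted above:

```python
# All figures are taken from the scenario described above.
rows       = 9_500_000   # rows in the heap
density    = 0.25        # density of TransType (1 / number of distinct values)
leaf_pages = 41_482      # leaf pages in the nonclustered index
reads      = 41_575      # logical reads observed for the seek

est_matching_rows = rows * density      # optimiser's row estimate for one value
distinct_values   = round(1 / density)  # distinct TransType values implied

print(f"estimated matching rows: {est_matching_rows:,.0f}")  # 2,375,000
print(f"implied distinct values: {distinct_values}")         # 4
print(f"reads vs leaf pages:     {reads / leaf_pages:.3f}")  # 1.002
```

In other words, the seek ends up reading essentially the entire leaf level anyway, which is what makes the operator choice so puzzling at first glance.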