• Technically, if you don't know the value, you don't have a complete representation of the entity.

    If you assume a value, you're basically saying that the current "measure" here is 0 - in other words you have a measure, and at this time, its 0.

    Yes I know this is technical. But to understand this is to "think like the computer thinks".

    Now, if you don't have a complete picture of an entity, what does that do to "selectivity value"? It goes down, right? I mean, to the computer, you have high value targets that maintain complete representations. Selectivity value of a target the computer sees as "incomplete" goes down. Not such a big deal if you don't need high degrees of selectivity for that value (you won't use it in a where clause or a join, its used in aggregation only).

    A zero - even if its proven inaccurate later, is simply "your understanding" of the point in space-time of the measure. If you're applying good design rules, in either case of an OLTP or an OLAP model, you should be able to deal with this.

    Of course this precludes "pre-processing" of things like the result set of an expensive left outer join - where you know you're going to have NULL conditions in the data - its unavoidable. However even there, you may have to carefully consider what makes sense for that condition. 0 or null? Will I have to use it in a where clause or a join? Generally that's how to look at it...

    Yes I know this is a hot-button topic and there is a lot of strong feelings. But when you start going in the direction of massive amounts of data stored in SQL Server, you have to consider these things carefully.

    Discuss it with your team / manager. Find out what makes sense. If you have a "null", is it an accurate depiction of that measure or entity? In other words, does it truly exist? Is that attribute necessary for the existence? If not, is it really an attribute?

    This is where we cross the boundary of human fuzzy logic to computer hard logic. The computer at its heart looks at ON or OFF. It doesn't consider "maybe on but it could be off". Humans, different story.