• Tom.Thomson (8/18/2010)


    David Webb-200187 (8/18/2010)


    I avoid using NULLable columns where practical, but I'm not religious about it.

    That's something I agree with 100% - let's avoid using nullable columns when it is practical to do so, and use them where it is impossible or impractical to avoid them (and yes, I am happy to assert that there are real world situations where it is impossible to avoid nulls - except perhaps by deciding not to bother with that database and the apps that use it after all).

    mtillman-921105 (8/18/2010)


    Redefining NULL on a whim? You can't redefine something that doesn't have a good definition to start with. N/A, unknown, not yet entered, incomplete, etc. can all be valid interpretations of a NULL depending on the context. That's the problem. In order to use it, you have to have a potentially different contextual definition in every table, in some cases multiple definitions per table if you have multiple NULLable columns. Even those can be wrong. Imagine a system where a client's address was NULLable. It might mean that the address had not yet been collected, or it could mean the client was homeless and had no fixed address. The meaning of the NULL address would be ambiguous.

    We have to avoid it being ambiguous. There are several ways of doing this. The simplest way is to give it a single simple meaning - NULL means "the database contains no value for this datum" - and not pretend that it means anything else (like not yet entered, or not applicable, or too secret to be allowed in this DB, or anything else other than that the database doesn't have a value for the datum). That's what Codd advocated when he first proposed having NULL. But ANSI screwed up and it doesn't mean that in SQL. Another way of doing it is to have multiple NULLs; one for each possible reason for the value not being in the database is ridiculous (we can't put a reasonable bound on the number of possible reasons - but please note that does NOT mean that Date's failed attempt at reductio ad absurdum starting from the case of two NULLs is logical or sensible in any sense) but it's quite possible to adopt Codd's later proposal for a two null system where NULLa means "the database has no value for the datum and the reason is not specified" and NULLi means "the database has no value for the datum because the column is inapplicable in this row" - there's no ambiguity and it's perfectly straightforward to deal with the two distinct NULLs.

    True except that in practice, actually being able to make that distinction is an awfully big step.What do you do WHILE you actually decipher whether the machine got unplugged, or became self-aware and stopped measuring that data you were looking for. Do you then need 3 NULLS? (the original definition AND the two latter ones?)

    Perhaps I deal in messy environments, but I find a lot of settings where the only real distinction needed is "is there data there" or "there's no data available". Putting placeholders in data when you just plain don't know and can't know WHY, just to make it "look" prettier, is like putting earrings on a pig....

    I could see having the option for either definition, but if I had to pick just one, I'd go with the original definition for the reason above.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?