So... what's wrong with null? Is it any more correct to have null or a zero length string for a middle name? I notice the author didn't provide a middle name. So does he not have one or did he just not provide it. I could provide a zero length string (blank) or null or you could have a table of "names" and link the person only to those "names" which they have and indicate the position (first, middle, last) but that would be silly. Normalization is a tool as is the judicious use of null. When carried to any extreme you become dogmatic. So, what's correct for an unknown value null or blank? Maybe it depends on context.
I run a simple business where I only ship items when all the items for the order are available (no back orders). So ShippedOn is a column in the OrderHeader table. When I receive an order, it hasn't shipped yet so, what date should be used? I could create a table of Shipped Orders but that involves a join. Also, taking that route I'd have to use an outer join to find pending orders. I could add IsShipped (bit) column to indicate if it's shipped or not but, that's tramp data that could be answered if I just leave the ShippedOn date column. I could make up a default date (1/1/1900) but that's just as valid as allowing nulls.
Allowing nulls is a tool, just like not allowing nulls. It depends on context, usage, business rules. You know -- reality.
I dig the article Mike. Way to explore an often-discussed topic more thoroughly than most of those discussions.
I guess it was pretty early on when I decided to embrace the NULL. It's been easy enough to handle them in the application, and in a facade when I'm not responsible for the application.
In one case, we got to the end of a project when a last minute requirement popped in requiring certain entities to own other entities. We added a nullable FK to the entity being owned, referencing the owner. Since the application was interacting with the database via SP's, we were able get the database ready in an hour or two. That allowed us to spend the little time we had on the handful of minor changes to the application.
I suppose that anecdote may support the use of SP's more than that of NULL's, but I still dig NULL's
An important meaning of nulls in the context of nullable columns is “value at present unknown” - the "A mark" null. The "I mark" null is arguably not especially well supported in general, as SQL effectively co-mingles both "types" of nulls. As the author alludes to with regard to tables with nullable columns, it is not unusual to encounter circumstances that may call for taking care with queries to correctly understand and summarize data i.e.(relative to similar tables without nullable columns), and that is perhaps the most obvious cost of NULLs.
A point of some critics of NULLs that appears to be lost here however, is that a DBMS implementation that did not implement NULLs i.e.(supporting a "new" SQL say) would need to handle issues typically handled with NULLs in another (and perhaps better) matter. Another point that may be worth raising is that if NULLs are to be supported for use as either A marks or as I marks, there is at least one product (FirstSql) that arguably supports NULLs "better" than most.
Thank you for addressing this issue Michael.
Fundamentally, NULL values indicate a fundamental flaw in tuple (noun) definition. SQL Server's treatment of NULL equality can be unexpected for those who work with other databases.
It’s impossible to avoid NULLs in many cases, specifically because of SQL Server's product position. It’s often the platform of choice for ETL. Therefore, many SQL ODS databases can suffer from "NULL" sickness because of a source system's poor table design.
Michael does a good job of explaining the "real life so deal with it" issues.
"On the other hand, there are those who would recommend a less dramatic method of dealing with NULL, namely: "learn how to use them properly, and minimize their use where practical."
WOW FINALLY, Someone to voice and understand the true need, and nature of NULL.
I have long argued that there are real world needs for null when the data is truly not known. As an auction company we take in 10's of thousands of items each year, and write detailed condition reports on them. We also create highly detailed attribute lists as part of the condition report.
So as an example if we created defaults of say "No" for an attribute like "Air Conditioning", it could lead to potentially damaging results.
If the condition report writer failed to check the A/C, there would be nothing to alert anyone that the attribute had been missed, as it would look like the asset does not have A/C. If the item does in fact have A/C and the writer simply missed it the end result would be a costly devaluation of the asset. In this case NULL, is valuable, and with 148 distinct attributes for a Truck Tractor or Trailer, as an example, I can only imagine (or rather can't imagine) the nightmare of maintaining and querying this in 6NF. While it is true that this could be handled in the interface by say not allowing the input of a condition report until all the questions are answered, there are distinct and real business reasons why you would not want do that, so null becomes the best solution. The only other course is to set the defaults to "Unanswered" but, then what about numeric fields?
Clearly; when appropriate, defaults and normalization should be used, but I think it was quite elegantly proven in this article that when used thoughtfully and with careful design NULL is a very powerful tool!