The title says it all: who Likes NULL values in their tables?
I have tended to allow NULLs in quite a few places in my design, often because I view the world as messy and incomplete. I also find that applications are faulty, and might not validate data, might not run long enough without a crash to let a user insert a lot of data. The application might mangle data, or just might not have been updated to support a new column of data. I've found that there are times where I accept the messy real world and use NULL to represent unknown values.
Dr. Low notes this as well in a recent post. His view is similar to mine in that he uses NULL values when we don't know the actual data. This is preferable to some magic value that has to be coded in every application using the database. There are too many chances of mistakes, and definitely the possibility of leakage for these magic values.
As we use more and reporting and aggregation tools, users may inadvertently see strange values exposed. Many of these tools wouldn't be coded to translate magic values to some agreed upon value, which results in confusion and distraction for clients. The data in our systems becomes used in new and different ways as we start to connect new applications to existing databases. We may also use ETL processes to move information among systems, often to data warehouse or OLAP data stores. Often there are proof of concept prototypes built with self-service tools, such as Power BI, and the logic that was originally coded to translate magic values is lost.
That doesn't mean that every field should allow NULLs, but that we should consider them in places where the data is useful, but not necessarily mandated or captured in every transactions. If we have valid defaults, use them, but if not, don't be afraid of NULL. Understand the meaning and implications of allowing NULLs and use them carefully.
I'm curious about if you agree with me. Do you default to NULL values or do you avoid them at all costs? Do you use them judiciously? Give me the reasons why or why not, and if you have examples of where you allow NULLs, let us know.
"...Distribution statistics are kept for columns and indexes. They indicate the selectivity of the values in a particular index or column. For example, in a table representing cars, many cars have the same manufacturer, but each car has a unique vehicle identification number (VIN). An index on the VIN is more selective than an index on the manufacturer."
This newsletter was sent to you because you signed up at SQLServerCentral.com.
Feel free to forward this to any colleagues that you think might be interested.
If you have received this email from a colleague, you can register to receive it here.