Database Design Follies: NULL vs. NOT NULL

  • very simple solution to all the nonsense.

    MAKE ALL COLUMNS NOT NULLABLE.

    If you can't figure out how to design a schema where that works for your business requirements, you can insist that application code insert defaults if you have to have a column in a table where you allow inserts before having a value.

    I resist the temptation to allow nullable columns in my designs, and I never regret the decision. I have seen huge data warehouse/analytic code bases, hundreds of thousande of lines of code, where the majority of the code was checking each column for nulls before doing it's actual work!

    How dumb!

  • kenambrose (10/30/2014)


    very simple solution to all the nonsense.

    MAKE ALL COLUMNS NOT NULLABLE.

    If you can't figure out how to design a schema where that works for your business requirements, you can insist that application code insert defaults if you have to have a column in a table where you allow inserts before having a value.

    I resist the temptation to allow nullable columns in my designs, and I never regret the decision. I have seen huge data warehouse/analytic code bases, hundreds of thousande of lines of code, where the majority of the code was checking each column for nulls before doing it's actual work!

    How dumb!

    That is a fine philosophy if default values can be defined and applied to every column in question. But, what if we're looking at a foreign key/joining column, where its sole purpose is to connect related data sets? We never check for NULL because that is not the function of the data in the first place. Or, perhaps existence is the critical purpose of it and we want to check it explicitly for that scenario. Maybe we don't want to confuse zero with NULL or blank with NULL?

    Sometimes a design like this will result in more efficient code. But as always, that efficiency is a combination of the relationships between the database schema and the code. Both need to be written to efficiently maintain and retrieve data with respect to each other, otherwise a system, NULLs or no NULLs, will not be well thought out, and will certainly not run efficiently.

  • kenambrose (10/30/2014)


    very simple solution to all the nonsense.

    MAKE ALL COLUMNS NOT NULLABLE.

    If you can't figure out how to design a schema where that works for your business requirements, you can insist that application code insert defaults if you have to have a column in a table where you allow inserts before having a value.

    I resist the temptation to allow nullable columns in my designs, and I never regret the decision. I have seen huge data warehouse/analytic code bases, hundreds of thousande of lines of code, where the majority of the code was checking each column for nulls before doing it's actual work!

    How dumb!

    Interesting approach. Suppose I have a user table which includes a DateOfBirth column, but for some users d-o-b may genuinely not be known. How do we get the row inserted without nulls or magic numbers, Ken?

  • Starwatcher (10/30/2014)


    kenambrose (10/30/2014)


    very simple solution to all the nonsense.

    MAKE ALL COLUMNS NOT NULLABLE.

    If you can't figure out how to design a schema where that works for your business requirements, you can insist that application code insert defaults if you have to have a column in a table where you allow inserts before having a value.

    I resist the temptation to allow nullable columns in my designs, and I never regret the decision. I have seen huge data warehouse/analytic code bases, hundreds of thousande of lines of code, where the majority of the code was checking each column for nulls before doing it's actual work!

    How dumb!

    Interesting approach. Suppose I have a user table which includes a DateOfBirth column, but for some users d-o-b may genuinely not be known. How do we get the row inserted without nulls or magic numbers, Ken?

    Easy; you use a fixed value for "don't know", such as 9999-12-31.

    Then your huge data warehouse codebase will no longer spend the majority of the code checking for NULL, instead, it'll check for 9999-12-31 and the other "don't know" values! :crazy:

    More seriously, NULL is fine if a given column has exactly one meaning for NULL. My problem with NULL is primarily that when someone uses NULL to mean more than one thing (for instance, both never collected AND doesn't exist, to use the address2 example)

  • I feel there are a number of good options.

    1. Personally, I feed magic numbers are STILL preferable over nulls. BUT I don't recommend them typically. ("default" value input by application layer when row is created)

    2: Put nullable columns into a related table. No data? No row. (schema design)

    3: Pressure the database vendors to support relational model. i.e. support for user domains with user defined operations. Then you can have a datatype called "FirmNameBirthDate" that would support value of "unknown" plus valid date values, WITH correct datemath where the date is valid.

    Obviously 3 is not an immediate solution, but in a world of database practitioners that don't even understand relational theory- and the practical consequences of database systems that don't support it- why would we expect the vendors to invest effort to support it?

  • [ instead, it'll check for 9999-12-31 and the other "don't know" values! Crazy]]

    Not necessary at all.

    Simply create a view that either eliminates rows with nulls using a where clause as opposed to isnull function called for every row (my recommendation), or replaces nulls with some value that will not require correction by the application code.

    And if you need the view to be updateable even though it has calculated columns, add an instead of trigger to the view for inserts and updates (speaking from experience that has worked well for me).

  • ZZartin (10/30/2014)


    Robert Mark (10/30/2014)


    I don't know how anyone using an application and seeing a NULL value on the screen would equate the NULL value to anything other than a zero. I would default a new account setup with a zero value for the balance. I don't see how a NULL value should ever be presented to a user for a balance. To me that is a bug that must be corrected. And if you are coding the application and you get a null value for the account balance, you are going to show a zero value on the screen for the user, not a NULL. So, by allowing a NULL value for a balance to begin with in the database, you are just adding more coding to the application.

    Again, if it is an aggregation or an outer join that is causing the NULL value you should either handle the Null value in the query or add code the application for those specific conditions. With the databases I design and use in applications, I never want to see a NULL value get read in by application code.

    /shrug It really depends, think about an application that requires certain values to be entered before other actions can be taken on a record but the user might want to save that record and come back to it later. Do you have the application check for whatever default value you picked, which in the case of a balance might not have an invalid value, or NULL as the check whether a required field has been filled out?

    NULL should be an account that is initiated but not yet available. Maybe additional background info is required. For example, a HELOC that is waiting on one final approval. A balance of 0 means the account fully exists and is likely active; a balance of NULL means the account has been created, i.e. an account number assigned, but is not ready for use yet. Maybe it's two more business days before the LOC becomes available for use.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • Starwatcher (10/30/2014)


    kenambrose (10/30/2014)


    very simple solution to all the nonsense.

    MAKE ALL COLUMNS NOT NULLABLE.

    If you can't figure out how to design a schema where that works for your business requirements, you can insist that application code insert defaults if you have to have a column in a table where you allow inserts before having a value.

    I resist the temptation to allow nullable columns in my designs, and I never regret the decision. I have seen huge data warehouse/analytic code bases, hundreds of thousande of lines of code, where the majority of the code was checking each column for nulls before doing it's actual work!

    How dumb!

    Interesting approach. Suppose I have a user table which includes a DateOfBirth column, but for some users d-o-b may genuinely not be known. How do we get the row inserted without nulls or magic numbers, Ken?

    Unrealistic, because often data in the real world is not as simple as you want it to be. Rather than arbitrarily and often falsely removing the NULLs, instead, if you prefer, use views or even computed columns to hide the NULLs from the vast majority of developers.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • Robert Mark (10/30/2014)


    I don't know how anyone using an application and seeing a NULL value on the screen would equate the NULL value to anything other than a zero. I would default a new account setup with a zero value for the balance. I don't see how a NULL value should ever be presented to a user for a balance. To me that is a bug that must be corrected. And if you are coding the application and you get a null value for the account balance, you are going to show a zero value on the screen for the user, not a NULL. So, by allowing a NULL value for a balance to begin with in the database, you are just adding more coding to the application.

    Again, if it is an aggregation or an outer join that is causing the NULL value you should either handle the Null value in the query or add code the application for those specific conditions. With the databases I design and use in applications, I never want to see a NULL value get read in by application code.

    But the data is what is critical and valuable, not the app code! A business's top priority in this area must be store accurate data. If current developers can't handle that, get different developers!

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • NULL seems to be encouraged when working with SSAS. While you can default values to mean not present (i.e. -1 surrogate keys), many of the default features/function within SSAS rely on the presence of NULL. I've been involved in the design of a large DW where we painstakingly made sure nothing was NULLABLE...only to have to build views for the cubes using NULLIF everywhere. While we were going to build views anyways, we found changing some of the default behaviors in SSAS can be taxing and a maintenance nightmare.

    Aigle de Guerre!

  • From my understanding - believe I read this in some SQL internals book 10 years ago - when NULL's are supported on a table then each row in that table will have a bit-pattern that indicates which columns in that ROW are null.

    The added expense of processing this "is-the-column-null" bit-pattern and the extra size of the bit pattern are the performance hits of allowing NULL's.

    imo - probably meaningless in the overall time used to process queries.

  • An important aspect of the "unknown" nature of NULL in SQL Server has been overlooked in the article and this forum. SQL Server looks at the column or variable with NULL assigned like a black box. Inside that black box there may be a value or there might not be a value, and if there is a value SQL Server can't know what it is.

    In most conditional statements, like WHERE clauses and IF statements, the conditional expression will return TRUE if and only if SQL Server considers the assertion provably true. When one of the players in a conditional expression is a NULL, SQL Server says it can not prove anything, and it returns false.

    The one big exception is a CHECK CONSTRAINT. With a CHECK CONSTRAINT the assertion just has to be not provably FALSE. Say we have [font="Courier New"]... MyEvenInteger Int NULL CHECK MyEvenInteger % 2 = 0[/font]. This will allow NULL, because SQL Server will say that the content of the black box *could* be even, so it cannot prove the assertion false.

    This would allow rows with NULL, but if we have a query with [font="Courier New"]...WHERE MyEvenInteger % 2 = 0[/font], the query will not return the rows with NULL.

    Sincerely,
    Daniel

  • An important aspect of the "unknown" nature of NULL in SQL Server has been overlooked in the article and this forum. SQL Server looks at the column or variable with NULL assigned like a black box. Inside that black box there may be a value or there might not be a value, and if there is a value SQL Server can't know what it is.

    In most conditional statements, like WHERE clauses and IF statements, the conditional expression will return TRUE if and only if SQL Server considers the assertion provably true. When one of the players in a conditional expression is a NULL, SQL Server says it can not prove anything, and it returns false.

    The one big exception is a CHECK CONSTRAINT. With a CHECK CONSTRAINT the assertion just has to be not provably FALSE. Say we have [font="Courier New"]... MyEvenInteger Int NULL CHECK MyEvenInteger % 2 = 0[/font]. This will allow NULL, because SQL Server will say that the content of the black box *could* be even, so it cannot prove the assertion false.

    This would allow rows with NULL, but if we have a query with [font="Courier New"]...WHERE MyEvenInteger % 2 = 0[/font], the query will not return the rows with NULL.

    Sincerely,
    Daniel

  • "Any WHERE clause that checks if column_name = NULL will return no rows, such as in this example."

    Not needs qualified to include considering setting ANSI_Nulls

    create table #TestIt(MyValue Char(1), MyNullValue char(1))

    insert into #TestIt(MyValue) Values('X')

    -- This does not return a row

    set ANSI_Nulls on

    select * from #TestIt where MyNullValue = null

    -- This returns a row

    set ANSI_Nulls off

    select * from #TestIt where MyNullValue = null

    -- This does not return a row

    set ANSI_Nulls on

    select * from #TestIt where MyNullValue = null

  • Not taking the care and time to figure out whether a column should allow nulls or not can cause grief. Take the pain now when your define the table.

Viewing 15 posts - 16 through 30 (of 137 total)

You must be logged in to reply to this topic. Login to reply