Four Rules for NULLs

  • Hello folks,

    Is null a value or simply a flag. If it is a flag then why DBMS doesn't handle it rather then giving its control to users?

  • NULL is not a value, it is a special mark (or flag) indicating the absence of a value.  Not sure about the second part of your question - in many instances the DBMS does 'handle' it.  In the case of aggregate functions, for instance, SQL Server 'handles' it by ignoring it.

    Because NULL indicates an absence of value, it requires special handling by SQL Server and SQL developers.  It also requires special handling by developers of middleware and front-end solutions that rely on SQL Server to store and retrieve data.  The special handling might be as simple as ignoring NULLs using COALESCECASE, or WHERE col IS NOT NULL on the back end; or IF...THEN...ELSE... style flow-control structures on the front-end.  Of course, if you're going to just explicitly ignore all NULLs, the question becomes why even allow NULLs?

    What you do with NULLs when you encounter them is defined as part of your business logic, and needs to be decided up-front; often on a column-by-column basis.  For instance, a NULL in a middle name column will probably be handled differently in your business logic than a NULL in an item cost column or a date ordered column.

  • Early on in the development of the SQL:1999 ANSI & ISO standard, there was a concept of user-defined NULL types. The idea was to allow up to 128 different types of NULLs.  One then needed a mechanism for specifying which NULL type, and comparing two NULL types to see if they were the same type of NULL.  The concept was very powerfull from a database design standpoint, but very complex to specify in the standard.  There was no indication that any of the vendors were ever likely to implement the concept, so it was eventually weeded out.

    FIPS 127-2 was largely written by Len Gallagher.  Len was NIST representative to the ANSI SQL committee (then X3H2) and served as the convenor (I think the title at the time was rappatour) of the ISO SQL group in the late 1980's to 1995.

    The great thing about NULLs is that you can spend hours talking about the meaning of nothing.

    Keith Hare

     

     


    Cordially,

    Keith W. Hare

  • Once again, great article, and great discussions and feedback.  Just curious (and no 'curiosity killed the cat' replies please :-), is there a way to test for Unknown? 
     
    I have found very little info about handling or working with Unknown.  I know, as I type this it does sound a little foolish...
     
    Thanks,
    Tom
  • Ahh, trying to find the great "Unknown", very philosophical Tom.

    I think many comments have alluded to the idea that that is the only true purpose of using NULL (other than some conspiracy to confuse hot dog vendors).

    On a not so philosophical nature; if you want to search for unknow values you should consider using "NOT IN" in your WHERE clause or a <> join.

    Max

  • Hi Tom, thanks for the comments.

    The main method I can think of for testing of UNKNOWNs is to perform three comparisons in SQL-92:

    IF x = y THEN   -- perform if x = y is TRUE

    ...

    ELSE IF NOT x = y THEN  -- perform if x = y is FALSE

    ...

    ELSE  -- perform if x = y is UNKNOWN

    SQL-99 offers BOOLEAN data types, with three BOOLEAN literals:  TRUE, FALSE, UNKNOWN.  Presumably this will make testing for UNKNOWN values easier, but it doesn't look like this was implemented in SQL 2K5.  (Someone please correct if I am wrong here, but I can't find it...)

  • On a not so philosophical nature; if you want to search for unknow values you should consider using "NOT IN" in your WHERE clause or a <> join.

    Both of which, by the way, are non-SARGable in WHERE clauses...

  • Thanks for once again dropping some inside knowledge on us Keith   For those of us who get a headache from 3-valued logic, just try to imagine 130-valued *user-defined* logic.  Ouch.

    BTW, when can I expect a copy of the SQL99 and SQL2003 standards in my inbox?  LOL.

  • Mike, good point about NOT IN not being SARGable in the WHERE clause. However, using "three comparisons in SQL-92" (IF...THEN...ELSE) would certainly not be very effecient for multiple row data sets comparison. By using covering indexes it is possible that using a <> JOIN would be quite effecient.

    I don't want to get into a scrap about methods here, suffice to say that there are many ways to find a solution, depending on the circumstances one may certainly be better than the other.

    Max

  • Yes, I should have added that the IF ... THEN ... ELSE method was not usable in a WHERE clause to my statement. I think the SQL99 standard looks like it probably offers the most intuitive method of checking for UNKNOWN, with its Boolean literals. Unfortunately I still can't find anything indicating this has been implemented in SQL 2K5

  • Hi Mike

    Here's an explanation on using UNKNOWN with SQL2003: http://msdn2.microsoft.com/en-us/library/ms188074.aspx

    To test this I created a table:

    CREATE TABLE t1([a] [int] NULL)

    GO

    INSERT INTO t1 (a) VALUES (NULL)

    INSERT INTO t1 (a) VALUES (1)

    and ran the following script:

    DECLARE @tinNull tinyint

    DECLARE cur CURSOR FOR

     SELECT a

     FROM t1

    OPEN cur

    FETCH NEXT FROM cur INTO @tinNull

    WHILE @@FETCH_STATUS = 0

    BEGIN

     PRINT ISNULL(CAST(@tinNull AS VARCHAR(10)), 'NULL')

     SET ANSI_NULLS ON 

     PRINT ' ANSI_NULLS ON'

     

     IF @tinNull = NULL PRINT ' = NULL'

     IF @tinNull IS NULL PRINT ' IS NULL'

     IF ISNULL(CAST(@tinNull AS VARCHAR(10)), 'NULL') = 'NULL' PRINT ' ISNULL comparisons possible'

     IF @tinNull = 1 PRINT ' TRUE'

     

     PRINT ''

     

     SET ANSI_NULLS OFF

     PRINT ' ANSI_NULLS OFF'

     

     IF @tinNull = NULL PRINT ' = NULL'

     IF @tinNull IS NULL PRINT ' IS NULL'

     IF ISNULL(CAST(@tinNull AS CHAR(10)), 'NULL') = 'NULL' PRINT ' ISNULL comparisons possible'

     IF @tinNull = 1 PRINT ' TRUE'

     

     PRINT '-------------------------------------'

     FETCH NEXT FROM cur INTO @tinNull

    END

    CLOSE cur

    DEALLOCATE cur

    The most interesting thing is in the loop of the cursor when it picks up IS NULL but not = NULL for ANSI_NULLS ON but ANSI_NULLS OFF is obviously different; the comparison returns true when = NULL.

    Max

  • Thanks for the link Max.  I don't like the fact that you can't declare a BOOLEAN variable and compare it against Boolean Literals (TRUE, FALSE, UNKNOWN), like in the following fake code [before anyone out there points it out, this code will *not* work]:

    DECLARE @result BOOLEAN;

    -- ...

    IF @result = UNKNOWN ...

    It'd be nice if they added this plus the ability to assign the result of logical comparison operations directly to a BOOLEAN variable, a'la VB, C++, etc.:

    SELECT @result = (@x = @y);

    IF @result = UNKNOWN ...

    This would be pretty useful and a lot more intuitive than most current methods of checking for UNKNOWN.

  • Nicely done, Mike.  

    I'm one of those who after 25 years (and several ports) have come to feel that C J Date has it right - NULLs really wreak havoc and probably should never have been allowed [as many of us know - the only real disagreement between Codd and Date]. 

    BUT!  since they ARE allowed, and I have to maintain what I wrote 15 years ago, and support the third party vendors who aren't members of MY "one true way to design" - You Need To Understand This Stuff.   I can't ignore NULLs just because I've personally sworn off them.

    You've done a great job of explaining the complexities and the pitfalls, and unlike most you show that the "three value logic" is the heart of the problem. 

    Let me mention that if you have ported code from Britton-Lee to Sybase to MS SQL Server,  the lack of common agreement between standards and vendors through the years is reason enough to avoid them going forward.

    It's like changing the rules of arithmetic every time you port - "oh - on THIS system 1+1 is 11, not 2 - and on THAT system 1+1 is 1 - not 2 or 11"

    Roger Reid

    Roger L Reid

  • Hi!

    You could also write about how NULL values effect the aggregate functions: count, sum, avg etc.

    MCs

  • Hi

    I'm not sure I agree exactly with your rule 1- or at least how you've used the idea of "unknown" to say what NULL is.

    Consider when using foreign keys..

    table EMPLOYER

    id

    name

    table person

    id

    NAME

    CURRENT_EMPLOYER_ID

    you'd constrain CURRENT_EMPLOYER_ID to EMPLOYER(ID)

    If you knew the person was unemployed you'd have to put NULL in current_employer_id

    The alternative is "special" values in your table "Not Employed".. apply that to every relation and you've got a nightmare

    In this case null doesn't mean "Unknown employer" it means we know there isn't an employer.

    Maybe this is wrong in pure terms, but I suspect that it's the more common use - especialy given the way that constraints work with NULL

    Paul

    PS yes, I agree with 4, and if ever I port to oracle I have to change isnull to nvl.. but I think that will be the least of my problems...!

Viewing 15 posts - 61 through 75 (of 152 total)

You must be logged in to reply to this topic. Login to reply