Who Likes NULL?



Jeff Moden SSC Guru Points: 1003863 · Answer 1

Lynn Pettis - Tuesday, July 17, 2018 10:30 AM
I hope you weren't thinking that I actually was advocating for 6th normal form designs, because I wasn't. 🙂
I haven't had to design highly complex databases; the few I have written went to 3rd normal form and were then denormalized where appropriate.

Heh... I've actually had to design "highly complex databases" before and I find that making them highly complex is the wrong thing to do. 😀 KISS it and to me, KISS is "Keep It Super Simple".

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.


mark.edwards-1115881 Valued Member Points: 56 · Answer 2

YOU'RE ALL WRONG!!!! My reasons are the only good ones! Read-and-take-heed! (I know, pretty conceited, but as good as any other reason in this thread.) Pick and choose your logic.
By the way, is anyone recording all the reasons and their logic, so that if this question is asked again they can just cut and paste this whole discussion and save the rest of us a lot of time and trouble?

Zidar SSChasing Mays Points: 651 · Answer 3

Perhaps we need to go back to the theory to deal with NULLs. I apologize for a somewhat longer post, perhaps even a boring one. If you are afraid of being bored, skip to the end and review my contribution to number theory 🙂

If we accept the definition of a relation as "a set of true propositions about something, derived from the same predicate", then NULLs simply do not fit there. Let's define 'proposition' and 'predicate'.

Proposition = a declarative sentence S for which we can ask "Is it true that S?" and get the answer YES or NO. There is no third value; it must be either YES or NO. Strictly speaking, classical mathematical logic has no third value. We can invent one, but that does not make it right. Just as division by zero does not make sense, introducing a third value into mathematical logic makes no sense.

Predicate = a parameterized sentence whose parameters can be replaced by constants to produce a proposition.

Here is an example of a predicate:

P = Person named [Fname] lives in [City] and was born on [DOB]

Propositions derived by supplying values for bracketed parameters:

S1 = Person named [Bob] lives in [New York] and was born on [June 16 1867].
S2 = Person named [Anna] lives in [Madrid] and was born on [May 25 1989].
S3 = Person named [Thor] lives in [Oslo] and was born on [May 25 1997].
...

which can be written in a familiar tabular form:

Fname     City          DOB
----------------------------------------
'Bob'     'New York'    'June 16 1867'
'Anna'    'Madrid'      'May 25 1989'
'Thor'    'Oslo'        'May 25 1997'

The rows in our table represent some of the propositions derived from the given predicate.

In order for our table to be a relation, some conditions must be met:
1) Propositions stored in the table are all TRUE.
2) Each proposition must be unique (no identical rows in the table).
3) Values in each 'column' are from the same domain - we don't store the name of the person's pet or first cousin in the column 'Fname', just as we do not store a person's name in the City column.

We can now define a relation in a shorter form:

A relation is a set of TRUE propositions derived from a common predicate.

No tables mentioned, although it is very convenient to present relations in tabular format. This definition covers all 3 conditions.

The first one, "Relation stores only TRUE propositions", is the key to understanding why NULLs generally do not fit into relations. Simply put, what is unknown cannot be asserted as true. It may be true, but we cannot be 100% sure.
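To make the "unknown is not true" point concrete, here is a minimal sketch of how a SQL engine treats comparisons against NULL. It uses Python's built-in sqlite3 module; the table name `person` and the helper `names` are my own assumptions for illustration, and the row for Bob deliberately has no DOB. Other engines follow the same three-valued rules for these comparisons, but details can differ.

```python
import sqlite3

# Hypothetical in-memory table; names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (fname TEXT, city TEXT, dob TEXT)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?)",
    [
        ("Bob", "New York", None),        # DOB not known, stored as NULL
        ("Anna", "Madrid", "1989-05-25"),
        ("Thor", "Oslo", "1997-05-25"),
    ],
)

def names(where):
    """Return the first names matching a WHERE clause, sorted."""
    sql = f"SELECT fname FROM person WHERE {where} ORDER BY fname"
    return [row[0] for row in conn.execute(sql)]

# A comparison with NULL evaluates to unknown, so Bob is in neither result.
born_on = names("dob = '1989-05-25'")
not_born_on = names("NOT (dob = '1989-05-25')")
# Only IS NULL finds the row.
unknown = names("dob IS NULL")
```

Bob appears in neither `born_on` nor `not_born_on`: the row satisfies neither the condition nor its negation, which is exactly the "maybe" that a set of true propositions has no place for.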

Just read aloud the following sentence: "Person named [Bob] lives in [New York] and was born on [NULL]". It does not sound right, does it? Regardless of how it sounds, we cannot declare it TRUE, since NULL does not represent a valid date. "Maybe" is not an answer; it does not tell us whether a sentence is TRUE or FALSE. What we are actually saying by using NULL is:

"Person named [Bob] lives in [New York] and we do not know their DOB".

This has quite a different meaning from the other sentences we stored in our relation:

S2 = Person named [Anna] lives in [Madrid] and was born on [May 25 1989].
S3 = Person named [Thor] lives in [Oslo] and was born on [May 25 1997].

The sentences about Anna and Thor obviously have the same meaning. If we argue that the sentence about Bob means the same thing, then we believe that the words "was born on some date" are identical to "we do not know their DOB". In that case, I have a mountain and a bridge to sell...

Is there a solution? There is. If we expect that the DOB will not be known at the moment a row is inserted into the table, we can create a new table/relation that says
"Person [Fname] was born on [DOB] and we learned this on [InsertDate] when user [UserID] made this record." Technically, it is not 6NF, because we have more than one non-key column. But if something is not available immediately, we might as well record when we obtained the information and who recorded it.

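A minimal sketch of the separate DOB relation described above, again using Python's built-in sqlite3 module. The table names `person` and `person_dob`, the insert dates, and the user IDs are all hypothetical; only the names, cities, and birth dates come from the example earlier in this post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (
    fname TEXT NOT NULL PRIMARY KEY,
    city  TEXT NOT NULL
);
-- A row exists here only once the DOB is actually known.
CREATE TABLE person_dob (
    fname       TEXT NOT NULL PRIMARY KEY REFERENCES person(fname),
    dob         TEXT NOT NULL,
    insert_date TEXT NOT NULL,   -- when we learned the DOB (made-up values)
    user_id     TEXT NOT NULL    -- who recorded it (made-up values)
);
""")
conn.executemany(
    "INSERT INTO person VALUES (?, ?)",
    [("Bob", "New York"), ("Anna", "Madrid"), ("Thor", "Oslo")],
)
conn.executemany(
    "INSERT INTO person_dob VALUES (?, ?, ?, ?)",
    [
        ("Anna", "1989-05-25", "2018-07-17", "user1"),
        ("Thor", "1997-05-25", "2018-07-17", "user2"),
    ],
)

# No NULL is stored anywhere; it appears only in the query output.
stored_nulls = conn.execute(
    "SELECT COUNT(*) FROM person_dob WHERE dob IS NULL"
).fetchone()[0]
report = conn.execute("""
    SELECT p.fname, p.city, d.dob
    FROM person AS p
    LEFT JOIN person_dob AS d ON d.fname = p.fname
    ORDER BY p.fname
""").fetchall()
```

Every stored row is a complete, true proposition. Bob's missing DOB shows up as NULL only in `report`, where it can mean just one thing: there is no row for Bob in `person_dob`.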
As for situations such as "Active/Inactive" or "StartDate/EndDate", we are forcing more than one proposition into a row at a time. For example, take the predicate
"Contractor [ContractorID] signed contract [ContractID] which has status 'A' from [FromDate] to [ToDate]".
With the use of NULL we may want to represent an open-ended contract.

Table 1:
ContractorID    ContractID    SignedDate        Status    FromDate          ToDate
-------------------------------------------------------------------------------------------
125             'CNT125/1'    'May 10 2018'     'A'       'July 16 2018'    'July 31 2019'
207             '207/1DBL'    'Sept 1 2015'     'A'       'July 16 2015'    'July 31 2017'
125             'CNT125/2'    'April 1 2018'    'A'       'July 16 2018'    NULL

Try to read the predicate aloud, supplying values from the table. Again, the row with NULL sounds awkward. There is nothing wrong with substituting '9999-01-01' for the NULL. After all, it is a valid date and it certainly belongs to the domain of dates. That makes it different from, say, 'N/A' or 'unknown' or 'OPEN' or - NULL.
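Here is a rough sketch of the sentinel-date substitution, using Python's built-in sqlite3 module and the Table 1 rows with dates rewritten in ISO format so that they compare correctly as text. The table name `contract_period`, the constant `OPEN_ENDED`, and the helper `active_on` are assumptions made for this illustration.

```python
import sqlite3

OPEN_ENDED = "9999-01-01"  # sentinel date from the post, a valid date value

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE contract_period (
        contractor_id INTEGER NOT NULL,
        contract_id   TEXT    NOT NULL PRIMARY KEY,
        signed_date   TEXT    NOT NULL,
        status        TEXT    NOT NULL,
        from_date     TEXT    NOT NULL,
        to_date       TEXT    NOT NULL   -- never NULL; open-ended uses the sentinel
    )
""")
conn.executemany(
    "INSERT INTO contract_period VALUES (?, ?, ?, ?, ?, ?)",
    [
        (125, "CNT125/1", "2018-05-10", "A", "2018-07-16", "2019-07-31"),
        (207, "207/1DBL", "2015-09-01", "A", "2015-07-16", "2017-07-31"),
        (125, "CNT125/2", "2018-04-01", "A", "2018-07-16", OPEN_ENDED),
    ],
)

def active_on(day):
    """Contracts whose period covers the given ISO date."""
    sql = """
        SELECT contract_id FROM contract_period
        WHERE from_date <= ? AND ? <= to_date
        ORDER BY contract_id
    """
    return [row[0] for row in conn.execute(sql, (day, day))]

active_2018 = active_on("2018-08-01")
active_2020 = active_on("2020-01-01")
```

Because the sentinel is an ordinary date, the range test needs no `IS NULL` branch. The price is that every consumer of the table has to know that '9999-01-01' means "open-ended".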

But there is a more serious problem with the design. Obviously, the table designer wants to track contracts and their active dates. The predicate talks about TWO things: a contract, and the period of validity of that contract. Perhaps a better design would be:
P1: "Contractor [ContractorID] signed contract [ContractID] on date [SignedDate]."
P2: "Contract [ContractID] has status [Status] from date [FromDate]."

The tables representing these predicates would be relations:

Table 2:
ContractorID    ContractID    SignedDate
---------------------------------------------
125             'CNT125/1'    'May 10 2018'
207             '207/1DBL'    'Sept 1 2015'
125             'CNT125/2'    'April 1 2018'

Table 3:
ContractID    Status        FromDate
----------------------------------------
'CNT125/1'    'Active'      'July 16 2018'
'207/1DBL'    'Active'      'July 16 2015'
'CNT125/2'    'Active'      'July 16 2018'
'CNT125/1'    'Inactive'    'July 31 2019'
'207/1DBL'    'Inactive'    'July 31 2017'

There is no end-date row for contract 'CNT125/2', since we don't know its end date.

Not a single NULL, no wild-card token, no magic code, and yet we can derive the same information displayed by Table 1. To display data like Table 1, we would write a query that returns exactly the same data set as Table 1, NULL included. Table 1 becomes output from our database, a report if you will, so we can display a NULL that has exactly one meaning: [ToDate], the expiry date for contract 'CNT125/2', is unknown. It can be unknown only because there is no record for it in Table 3. Everything is clear. Again, it would not be wrong to add a row with a future date:

ContractID    Status        FromDate
----------------------------------------
'CNT125/1'    'Active'      'July 16 2018'
'207/1DBL'    'Active'      'July 16 2015'
'CNT125/2'    'Active'      'July 16 2018'
'CNT125/1'    'Inactive'    'July 31 2019'
'207/1DBL'    'Inactive'    'July 31 2017'
'CNT125/2'    'Active'      'June 30 2022'    <-- the date 'January 1 9999' is as valid as any date

The problem with NULL is its multiple meanings: 'missing, but should be there', 'we don't have the data yet', 'deleted, maybe', 'we have no clue why this is empty'. So we try to avoid NULLs when storing data. In reports, however, NULL is a rather desirable thing, because it conveys the exact meaning 'this value is missing'.
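For what it is worth, here is one way the report query could look, sketched with Python's built-in sqlite3 module. The table names `contract` and `contract_status` stand in for Table 2 and Table 3 and are my own assumptions, as is the choice of joins. Dates are rewritten in ISO format, and the optional future-dated row is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contract (               -- Table 2
    contractor_id INTEGER NOT NULL,
    contract_id   TEXT    NOT NULL PRIMARY KEY,
    signed_date   TEXT    NOT NULL
);
CREATE TABLE contract_status (        -- Table 3
    contract_id TEXT NOT NULL REFERENCES contract(contract_id),
    status      TEXT NOT NULL,
    from_date   TEXT NOT NULL,
    PRIMARY KEY (contract_id, status)
);
""")
conn.executemany("INSERT INTO contract VALUES (?, ?, ?)", [
    (125, "CNT125/1", "2018-05-10"),
    (207, "207/1DBL", "2015-09-01"),
    (125, "CNT125/2", "2018-04-01"),
])
conn.executemany("INSERT INTO contract_status VALUES (?, ?, ?)", [
    ("CNT125/1", "Active",   "2018-07-16"),
    ("207/1DBL", "Active",   "2015-07-16"),
    ("CNT125/2", "Active",   "2018-07-16"),
    ("CNT125/1", "Inactive", "2019-07-31"),
    ("207/1DBL", "Inactive", "2017-07-31"),
])

# The LEFT JOIN produces NULL for to_date only when no 'Inactive' row exists.
report = conn.execute("""
    SELECT c.contractor_id, c.contract_id, c.signed_date,
           a.status, a.from_date, i.from_date AS to_date
    FROM contract AS c
    JOIN contract_status AS a
      ON a.contract_id = c.contract_id AND a.status = 'Active'
    LEFT JOIN contract_status AS i
      ON i.contract_id = c.contract_id AND i.status = 'Inactive'
    ORDER BY c.contract_id
""").fetchall()

stored_nulls = conn.execute(
    "SELECT COUNT(*) FROM contract_status WHERE from_date IS NULL"
).fetchone()[0]
```

The only NULL in `report` is the `to_date` of 'CNT125/2', and it is produced by the outer join rather than stored, so its meaning is unambiguous.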