Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase ««12345»»»

Why is data integrity important? Expand / Collapse
Author
Message
Posted Monday, August 30, 2010 11:10 AM
Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Thursday, July 17, 2014 12:34 PM
Points: 1,414, Visits: 4,539
i've seen instances where most trouble tickets are simple DML to fix errors the application or people made in entering data. sales people are notorious for submitting incomplete or wrong paperwork that requires a lot of work to fix

https://plus.google.com/100125998302068852885/posts?hl=en
http://twitter.com/alent1234
x-box live gamertag: i am null
[url=http://live.xbox.com/en-US/MyXbox/Profile[/url]
Post #977460
Posted Monday, August 30, 2010 11:20 AM


Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: Friday, November 11, 2011 1:21 PM
Points: 50, Visits: 162
Steve Jones - Editor (8/30/2010)
I've always seen "N/A" as "Not applicable", but you give some good other responses that might be used. Hence another point for the "non-nulls" argument.


I have dealt with DBA's who insisted on all columns being non-NULL and I have engaged in "disucssions" about such (both in person and in cyberspace) and I have yet to find a really good argument for changing my position: NULLs should be used when appropriate and avoided when not apporpriate.

If there is a referential integrity rule that, in effect, says that every row in TableA has to have a FK/PK link to TableB, then it is inappropriate for that column in TableA to be NULL. However, if there is a column, for instance, in a Hospital Information System's PatientDetails table, that is supposed to contain a Date of Birth and it is possible that a) that information may notbe readily available (e.g. John Doe is unconscious when admitted) and b) it is not a show stopper if it either is incorrect or unavailable, I would rather have that column in that row left NULL than to assign an arbitrary DOB. If the column is not essential to the overall integrity of that row, when I am faced with the choice of using an arbitrary default value or leaving the column NULL, I choose the NULL.


Ralph D. Wilson II
Development DBA

"Give me 6 hours to chop down a tree and I will spend the first 4 sharpening the ax."
A. Lincoln
Post #977468
Posted Monday, August 30, 2010 11:25 AM
SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Thursday, May 15, 2014 8:31 AM
Points: 419, Visits: 725
There is always a time and place for NULLs. Think "Address Line 2" type fields. Chances are, the value in those type of fields are NULL more often than their not. But I woulnd't want a NULL in "Address Line 1", or company name, or...
Post #977470
Posted Monday, August 30, 2010 11:32 AM
Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Thursday, July 17, 2014 12:34 PM
Points: 1,414, Visits: 4,539
i've seen this no NULL's allowed OCD thing being taken to extremes

say you sign up a customer today and promise some service that will be done in the near future but you're not sure of the date. since the column is not null you have to put in a crazy date like 1/1/2049. or if something was done in the past then 1/1/1901

so if a customer calls in a few days the CSR will happily tell them they can expect to see someone come to their site in 2049


https://plus.google.com/100125998302068852885/posts?hl=en
http://twitter.com/alent1234
x-box live gamertag: i am null
[url=http://live.xbox.com/en-US/MyXbox/Profile[/url]
Post #977472
Posted Monday, August 30, 2010 11:34 AM
SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Thursday, May 15, 2014 8:31 AM
Points: 419, Visits: 725
It's as they say: no matter how good a thing is, there is always such a thing as too much of it.
Post #977475
Posted Monday, August 30, 2010 12:24 PM


SSC-Insane

SSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-Insane

Group: General Forum Members
Last Login: Yesterday @ 11:08 PM
Points: 21,385, Visits: 9,602
alen teplitsky (8/30/2010)
i've seen this no NULL's allowed OCD thing being taken to extremes

say you sign up a customer today and promise some service that will be done in the near future but you're not sure of the date. since the column is not null you have to put in a crazy date like 1/1/2049. or if something was done in the past then 1/1/1901

so if a customer calls in a few days the CSR will happily tell them they can expect to see someone come to their site in 2049


Hehe, I've worked for them .
Post #977507
Posted Monday, August 30, 2010 12:28 PM


SSC-Insane

SSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-Insane

Group: General Forum Members
Last Login: Yesterday @ 11:08 PM
Points: 21,385, Visits: 9,602
RalphWilson (8/30/2010)
Steve Jones - Editor (8/30/2010)
I've always seen "N/A" as "Not applicable", but you give some good other responses that might be used. Hence another point for the "non-nulls" argument.


I have dealt with DBA's who insisted on all columns being non-NULL and I have engaged in "disucssions" about such (both in person and in cyberspace) and I have yet to find a really good argument for changing my position: NULLs should be used when appropriate and avoided when not apporpriate.

If there is a referential integrity rule that, in effect, says that every row in TableA has to have a FK/PK link to TableB, then it is inappropriate for that column in TableA to be NULL. However, if there is a column, for instance, in a Hospital Information System's PatientDetails table, that is supposed to contain a Date of Birth and it is possible that a) that information may notbe readily available (e.g. John Doe is unconscious when admitted) and b) it is not a show stopper if it either is incorrect or unavailable, I would rather have that column in that row left NULL than to assign an arbitrary DOB. If the column is not essential to the overall integrity of that row, when I am faced with the choice of using an arbitrary default value or leaving the column NULL, I choose the NULL.


One argument that can hardly be touched : no matter how you deal with it in the db, you'll have to do work in the application to handle cases... or waste time training people so that 2049 is not the date we are planning to visit to do the work.

If you decide to somehow ban all nulls in te db by adding tables, then you'll get all those nulls back as soon as you query anything because of all the left joins you'll have to perform.

Post #977510
Posted Monday, August 30, 2010 12:35 PM
SSC Eights!

SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!

Group: General Forum Members
Last Login: Friday, July 25, 2014 2:49 PM
Points: 804, Visits: 1,989
So, Steve, what is the definition of "N/A"? Is it "Not Applicable" or "Not Available" or "No Answer"?


Just because there are multiple possibilities does not invalidate the use of Nulls. Sometimes all of the answers are possible, hence the reason some people advocate the creation of multiple types of NULLs. If N/A could be put in an integer or a date field, then someone could make a better case against nulls. Until that time, however...

In OLAP designs, there are no NULL FKs because OLTP Nulls can be accounted for with a dimension row. These have two possible rows, however, one for N/A (Not Applicable) and one for Unknown, because there should be an answer. An incomplete order will not have a completion date, and will appear in the N/A column for Completed Date. A completed order that doesn't have a complete date, however, will be in the Unknown column, because it should have a complete date. A missing Create Date will always be Unkown, because a created item should have one of these.

This is possible in an OLAP design the date field can (and should) be masked with a text field, however, so it's possible use 12/31/69 for N/A and 1/1/70 for unknown, because the user will never see these. This would be awkward in an OLTP design.

While we are on the subject of OLAP designs, there are average calculations that would not work w/o the use of Nulls because that's the best way to omit from a calculation rows that have nothing to do with that particular calculation.



Post #977515
Posted Monday, August 30, 2010 12:41 PM
SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Thursday, May 15, 2014 8:31 AM
Points: 419, Visits: 725
I actually do the same thing with my data warehouse, having seperate keys for N/A and Unknown. If there really wasn't a part sold on an order line, then the part is N/A. If I have a customer without a customer type, it is Unknown. The difference is that there should be a customer type, but isn't. There doesn't have to be a part.

However, since CustomerType is an attribute and not a dimension itself, and the source system allows NULLs, I originally had the field as a nullable column. This caused a lot of problems with groupings and filters, most of which were due to the old NULL <> NULL rule. It was then that all the NULLs were replaced with Unknown.
Post #977518
Posted Monday, August 30, 2010 12:52 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: Administrators
Last Login: Today @ 12:50 PM
Points: 33,073, Visits: 15,187
RonKyle (8/30/2010)

Just because there are multiple possibilities does not invalidate the use of Nulls.


I would agree, but it does make an argument that NULLS can be problematic. I think NULLs can work well in an application.

Great point on the OLAP, ignore rows in a calculation, comment.







Follow me on Twitter: @way0utwest

Forum Etiquette: How to post data/code on a forum to get the best help
Post #977527
« Prev Topic | Next Topic »

Add to briefcase ««12345»»»

Permissions Expand / Collapse