SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Why is data integrity important?


Why is data integrity important?

Author
Message
alen teplitsky
alen teplitsky
SSChampion
SSChampion (12K reputation)SSChampion (12K reputation)SSChampion (12K reputation)SSChampion (12K reputation)SSChampion (12K reputation)SSChampion (12K reputation)SSChampion (12K reputation)SSChampion (12K reputation)

Group: General Forum Members
Points: 12402 Visits: 4674
i've seen instances where most trouble tickets are simple DML to fix errors the application or people made in entering data. sales people are notorious for submitting incomplete or wrong paperwork that requires a lot of work to fix
RalphWilson
RalphWilson
SSC-Enthusiastic
SSC-Enthusiastic (198 reputation)SSC-Enthusiastic (198 reputation)SSC-Enthusiastic (198 reputation)SSC-Enthusiastic (198 reputation)SSC-Enthusiastic (198 reputation)SSC-Enthusiastic (198 reputation)SSC-Enthusiastic (198 reputation)SSC-Enthusiastic (198 reputation)

Group: General Forum Members
Points: 198 Visits: 162
Steve Jones - Editor (8/30/2010)
I've always seen "N/A" as "Not applicable", but you give some good other responses that might be used. Hence another point for the "non-nulls" argument.


I have dealt with DBA's who insisted on all columns being non-NULL and I have engaged in "disucssions" about such (both in person and in cyberspace) and I have yet to find a really good argument for changing my position: NULLs should be used when appropriate and avoided when not apporpriate.

If there is a referential integrity rule that, in effect, says that every row in TableA has to have a FK/PK link to TableB, then it is inappropriate for that column in TableA to be NULL. However, if there is a column, for instance, in a Hospital Information System's PatientDetails table, that is supposed to contain a Date of Birth and it is possible that a) that information may notbe readily available (e.g. John Doe is unconscious when admitted) and b) it is not a show stopper if it either is incorrect or unavailable, I would rather have that column in that row left NULL than to assign an arbitrary DOB. If the column is not essential to the overall integrity of that row, when I am faced with the choice of using an arbitrary default value or leaving the column NULL, I choose the NULL.

Ralph D. Wilson II
Development DBA

"Give me 6 hours to chop down a tree and I will spend the first 4 sharpening the ax."
A. Lincoln
jvanderberg
jvanderberg
SSCrazy
SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)

Group: General Forum Members
Points: 2914 Visits: 746
There is always a time and place for NULLs. Think "Address Line 2" type fields. Chances are, the value in those type of fields are NULL more often than their not. But I woulnd't want a NULL in "Address Line 1", or company name, or...
alen teplitsky
alen teplitsky
SSChampion
SSChampion (12K reputation)SSChampion (12K reputation)SSChampion (12K reputation)SSChampion (12K reputation)SSChampion (12K reputation)SSChampion (12K reputation)SSChampion (12K reputation)SSChampion (12K reputation)

Group: General Forum Members
Points: 12402 Visits: 4674
i've seen this no NULL's allowed OCD thing being taken to extremes

say you sign up a customer today and promise some service that will be done in the near future but you're not sure of the date. since the column is not null you have to put in a crazy date like 1/1/2049. or if something was done in the past then 1/1/1901

so if a customer calls in a few days the CSR will happily tell them they can expect to see someone come to their site in 2049
jvanderberg
jvanderberg
SSCrazy
SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)

Group: General Forum Members
Points: 2914 Visits: 746
It's as they say: no matter how good a thing is, there is always such a thing as too much of it.
Ninja's_RGR'us
Ninja's_RGR'us
SSC Guru
SSC Guru (116K reputation)SSC Guru (116K reputation)SSC Guru (116K reputation)SSC Guru (116K reputation)SSC Guru (116K reputation)SSC Guru (116K reputation)SSC Guru (116K reputation)SSC Guru (116K reputation)

Group: General Forum Members
Points: 116446 Visits: 9672
alen teplitsky (8/30/2010)
i've seen this no NULL's allowed OCD thing being taken to extremes

say you sign up a customer today and promise some service that will be done in the near future but you're not sure of the date. since the column is not null you have to put in a crazy date like 1/1/2049. or if something was done in the past then 1/1/1901

so if a customer calls in a few days the CSR will happily tell them they can expect to see someone come to their site in 2049


Hehe, I've worked for them w00t.
Ninja's_RGR'us
Ninja's_RGR'us
SSC Guru
SSC Guru (116K reputation)SSC Guru (116K reputation)SSC Guru (116K reputation)SSC Guru (116K reputation)SSC Guru (116K reputation)SSC Guru (116K reputation)SSC Guru (116K reputation)SSC Guru (116K reputation)

Group: General Forum Members
Points: 116446 Visits: 9672
RalphWilson (8/30/2010)
Steve Jones - Editor (8/30/2010)
I've always seen "N/A" as "Not applicable", but you give some good other responses that might be used. Hence another point for the "non-nulls" argument.


I have dealt with DBA's who insisted on all columns being non-NULL and I have engaged in "disucssions" about such (both in person and in cyberspace) and I have yet to find a really good argument for changing my position: NULLs should be used when appropriate and avoided when not apporpriate.

If there is a referential integrity rule that, in effect, says that every row in TableA has to have a FK/PK link to TableB, then it is inappropriate for that column in TableA to be NULL. However, if there is a column, for instance, in a Hospital Information System's PatientDetails table, that is supposed to contain a Date of Birth and it is possible that a) that information may notbe readily available (e.g. John Doe is unconscious when admitted) and b) it is not a show stopper if it either is incorrect or unavailable, I would rather have that column in that row left NULL than to assign an arbitrary DOB. If the column is not essential to the overall integrity of that row, when I am faced with the choice of using an arbitrary default value or leaving the column NULL, I choose the NULL.


One argument that can hardly be touched : no matter how you deal with it in the db, you'll have to do work in the application to handle cases... or waste time training people so that 2049 is not the date we are planning to visit to do the work.

If you decide to somehow ban all nulls in te db by adding tables, then you'll get all those nulls back as soon as you query anything because of all the left joins you'll have to perform.
RonKyle
RonKyle
SSChampion
SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)

Group: General Forum Members
Points: 11896 Visits: 3748
So, Steve, what is the definition of "N/A"? Is it "Not Applicable" or "Not Available" or "No Answer"?


Just because there are multiple possibilities does not invalidate the use of Nulls. Sometimes all of the answers are possible, hence the reason some people advocate the creation of multiple types of NULLs. If N/A could be put in an integer or a date field, then someone could make a better case against nulls. Until that time, however...

In OLAP designs, there are no NULL FKs because OLTP Nulls can be accounted for with a dimension row. These have two possible rows, however, one for N/A (Not Applicable) and one for Unknown, because there should be an answer. An incomplete order will not have a completion date, and will appear in the N/A column for Completed Date. A completed order that doesn't have a complete date, however, will be in the Unknown column, because it should have a complete date. A missing Create Date will always be Unkown, because a created item should have one of these.

This is possible in an OLAP design the date field can (and should) be masked with a text field, however, so it's possible use 12/31/69 for N/A and 1/1/70 for unknown, because the user will never see these. This would be awkward in an OLTP design.

While we are on the subject of OLAP designs, there are average calculations that would not work w/o the use of Nulls because that's the best way to omit from a calculation rows that have nothing to do with that particular calculation.



jvanderberg
jvanderberg
SSCrazy
SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)

Group: General Forum Members
Points: 2914 Visits: 746
I actually do the same thing with my data warehouse, having seperate keys for N/A and Unknown. If there really wasn't a part sold on an order line, then the part is N/A. If I have a customer without a customer type, it is Unknown. The difference is that there should be a customer type, but isn't. There doesn't have to be a part.

However, since CustomerType is an attribute and not a dimension itself, and the source system allows NULLs, I originally had the field as a nullable column. This caused a lot of problems with groupings and filters, most of which were due to the old NULL <> NULL rule. It was then that all the NULLs were replaced with Unknown.
Steve Jones
Steve Jones
SSC Guru
SSC Guru (248K reputation)SSC Guru (248K reputation)SSC Guru (248K reputation)SSC Guru (248K reputation)SSC Guru (248K reputation)SSC Guru (248K reputation)SSC Guru (248K reputation)SSC Guru (248K reputation)

Group: Administrators
Points: 248460 Visits: 19807
RonKyle (8/30/2010)

Just because there are multiple possibilities does not invalidate the use of Nulls.


I would agree, but it does make an argument that NULLS can be problematic. I think NULLs can work well in an application.

Great point on the OLAP, ignore rows in a calculation, comment.

Follow me on Twitter: @way0utwest
Forum Etiquette: How to post data/code on a forum to get the best help
My Blog: www.voiceofthedba.com
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum







































































































































































SQLServerCentral


Search