Normalization

Question

Normalization

Viewing 15 posts - 31 through 45 (of 144 total)

You must be logged in to reply to this topic. Login to reply

Steven993 Mr or Mrs. 500 Points: 588 More actions · Answer 1

Tom.Thomson (1/26/2011)
David Portas (1/22/2010)
Irrespective of that, the Normal Forms, upto and including 6NF don't leave any room for the possibility of nulls. Codd himself says so: "The concepts and rules of functional dependence, multi-valued dependence, and join dependence were all developed without considering missing db-values" ("Missing Informatlon in Relational Databases", 1986).
And he meant what he said, not what you claim what he said meant (which it clearly didn't). What he meant was "It was neccessary to look at these rules again in the light of having nulls". That's been done, and in none of the normal forms which existed by 1986 (10 years after when you say nulls were invented) did the nulls make the slightest difference.

Well, maybe you could explain us how Heath's theorem applys with NULL.

Let's say we have a relation R with attributes {X, Y, Z} which hold the FD X ? Y.

Then R can be decomposed without loss in Ra{X, Y} and Rb{X, Z}.

We can find R with the natural join of Ra and Rb.

Put some NULL in X.

Ra Rb Ra JOIN Rb

X Y X Z X Y Z

------- ------- ---------

x1 y1 x1 z1 x1 y1 z1

NULL y2 NULL z2

No difference ?

TomThomson SSC Guru Points: 104773 More actions · Answer 2

Steven993 (3/7/2011)
Tom.Thomson (1/26/2011)
David Portas (1/22/2010)
Irrespective of that, the Normal Forms, upto and including 6NF don't leave any room for the possibility of nulls. Codd himself says so: "The concepts and rules of functional dependence, multi-valued dependence, and join dependence were all developed without considering missing db-values" ("Missing Informatlon in Relational Databases", 1986).
And he meant what he said, not what you claim what he said meant (which it clearly didn't). What he meant was "It was neccessary to look at these rules again in the light of having nulls". That's been done, and in none of the normal forms which existed by 1986 (10 years after when you say nulls were invented) did the nulls make the slightest difference.
Well, maybe you could explain us how Heath's theorem applys with NULL.
Let's say we have a relation R with attributes {X, Y, Z} which hold the FD X ? Y.
Then R can be decomposed without loss in Ra{X, Y} and Rb{X, Z}.
We can find R with the natural join of Ra and Rb.
Put some NULL in X.
Ra Rb Ra JOIN Rb
X Y X Z X Y Z
------- ------- ---------
x1 y1 x1 z1 x1 y1 z1
NULL y2 NULL z2
No difference ?

From your description above I suspect you have not understood Heath's theorem; the usual statement of the theorem is

A relation R(A,B,C) that satisfies a functional
dependency A ?--> B can always be non-loss decomposed
into its projections R1=?AB(R) and R2=?AC(R).

The premise of Heath's theorem is that a functional dependency A -> B exists, so before I decompose into R into the relations you call Ra and Rb I need that functional dependency to exist; if A is nullable, there can be no such functional dependency, so the premise is false and the theorem makes no assertion about the join of the decomposition (or indeed anything else to do with the decomposition in question) - in other words Heath's theorem continues to hold with no exceptions when nullable columns are allowed.

But Heath's theorem isn't the point that was being discussed anyway, normalisation is, in particular (since we are talking about lossless decomposition as soon as you mention Heath's theorem) 5NF, which is defined to mean that every join dependency (lossless projection pair) of R is implied by the candidate keys of R. So in effect your argument is proposing that a candidate key can have a nullable column in it? Havers, man, sheer nonsense!!!

Tom

Steven993 Mr or Mrs. 500 Points: 588 More actions · Answer 3

Tom.Thomson (3/7/2011)
if A is nullable, there can be no such functional dependency

My demonstration was maybe too short, but this was my point.

NULL impacts nomalization...

Tom.Thomson (3/7/2011)
But Heath's theorem isn't the point that was being discussed anyway, normalisation is, in particular (since we are talking about lossless decomposition as soon as you mention Heath's theorem) 5NF, which is defined to mean that every join dependency (lossless projection pair) of R is implied by the candidate keys of R.

Heath's theorem is fundamental in normalization. It's obvious.

I'd like to point out that in normalization there is no such obligation, or goal to be in 5NF.....

Normalization is an orthogonal subject to Relational Theory.

Every relation is in 1NF, it's always good to normalize (further and further) to reduce or eliminate redondancy and updates anomalies, but there is NO obligation....

Tom.Thomson (3/7/2011)
So in effect your argument is proposing that a candidate key can have a nullable column in it? Havers, man, sheer nonsense!!!

Not at all.. I follow Chris Date, Hugh Darwen, Lex de Haan, Fabian Pascal and others, therefore I don't allow NULL by definition in relations.

What is nonsense to me, is allowing NULL, and moreover allowing it in some attributes and not in others (keys)...... WHY ????

Welsh Corgi SSC Guru Points: 116520 More actions · Answer 4

Good dialogue.

For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Welsh Corgi SSC Guru Points: 116520 More actions · Answer 5

It is customary to have an Address1 and Address2 Line when it is the same entity, for example Address Contains the Street Number and Name and Address two contains the Unit or Apartment number.

If the second address refers to a separate identity, i.e Billing, Mailing Address, etc then it must be a separate entity.

If for example it is a separate order than it is a seperate table and is a separate entity .

Regards,

Welsh

For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Welsh Corgi SSC Guru Points: 116520 More actions · Answer 6

With respect to the Address Table Creation. You create a Mailing Address Table, Billing Address Table, etc...

For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Welsh Corgi SSC Guru Points: 116520 More actions · Answer 7

I do not mean to be sarcastic but how many people have deployed a relational database beyond 3rd normal for? I could see 1 up from that but 3rd normal form but 6th normal form?

What kind of performance do you get?

How did your OLTP perform?

For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Jeff Moden SSC Guru Points: 1003919 More actions · Answer 8

Welsh Corgi (4/23/2011)
It is customary to have an Address1 and Address2 Line when it is the same entity, for example Address Contains the Street Number and Name and Address two contains the Unit or Apartment number.
If the second address refers to a separate identity, i.e Billing, Mailing Address, etc then it must be a separate entity.
If for example it is a separate order than it is a seperate table and is a separate entity .
Regards,
Welsh

Just a second opinion... I'll normally create a single address table and relate it to an address type table through an associative table.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Welsh Corgi SSC Guru Points: 116520 More actions · Answer 9

Jeff,

I have no issues with your address solution but it requires an extra join and it can degrade performance.

If the objective is perfect normalization then go with an extra table but if you want to potentially avoid derogation of performance use an address1 & address 2 lines.

The address 2 line is usually null.

Personally I would not create a second address line for the same type of address unless it improved the execution plan.

Regards,

Welsh

For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Steven993 Mr or Mrs. 500 Points: 588 More actions · Answer 10

I have no issues with you address solution but it requires an extra join and it can degrade performance.
If the objective is perfect normalization then go with an extra table but if you want to potentially avoid derogation of performance use an address1 & address 2 lines.
The address 2 line is usually null.
Personally I would not create a second address line for the same type of address unless it improved the execution plan.

I think you'r off topic.

Why are you talking about performances when we talk about normalization ?

Physical data independence is a fundamental principle in relational theory you should not be ignoring.

Anyway, if you want to talk about performance of a particular logical database design, make a real test, get some metrics and then we can talk about it.

A priori approach of database design is not enough.

Welsh Corgi SSC Guru Points: 116520 More actions · Answer 11

Steven,

The more tables that you join the worse the performance gets.

The reason you want to pull address 1 and address 2 from the same table is because you do not want to degrade performnace by using an extra join and if you are printing lables or an address you do not want the address to be truncated.

How many OLTP (Relational) DB's have you seen where the Address Line 2 was stored in a separate table?

Please let me know...

For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Welsh Corgi SSC Guru Points: 116520 More actions · Answer 12

Why am I talking about performance when you are talking about normalization.

Do you not see a correlation?

For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Welsh Corgi SSC Guru Points: 116520 More actions · Answer 13

Steven993,

You are so funny.

LOL,

Welsh

For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Welsh Corgi SSC Guru Points: 116520 More actions · Answer 14

Why am I talking about normalization as opposed to performance?

For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Welsh Corgi SSC Guru Points: 116520 More actions · Answer 15

I do have an issue with separating address going from one to two into separate table degrades performance.

For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/