Third Normal Form & some related things


Steven.Grzybowski
With third normal form, is there a general "rule of thumb" for when you should or should not apply it? I understand that a lot of small tables can cause performance issues and can make the database more complex, but is there any rule about the relative sizes of tables that should be split off? Also, for "smaller" fields (numeric values, small varchar fields), at what point does the performance difference between linking to another table and keeping the data in the same table become negligible, or turn in favor of keeping things in one table?

On a similar note, at what point does it make more sense to use a view rather than a table? A view can pull the data together, and redundant data is bad, but is redundant data justifiable for speed? Suppose the table that would hold the redundant data is only a couple hundred MB, while the table it would be joined to in a view is a few hundred GB. Does it make sense to store redundant data in the smaller table when it is accessed frequently? For example: Table1 (100,000 rows, 5 columns, 3 of which are redundant data) vs. Table1 joined to Table2 (100,000,000 rows, 20 columns). Building Table1 and filling in the redundant fields takes time, but the join to Table2 then happens once, instead of every time somebody wants to look at the report.
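
To make the two options in the question concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, purely so it runs anywhere; that substitution is an assumption, as are the names table1, table2, report_view and report_copy and the tiny row counts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-ins for the small and the large table.
    CREATE TABLE table2 (id INTEGER PRIMARY KEY, region TEXT, category TEXT);
    CREATE TABLE table1 (
        id INTEGER PRIMARY KEY,
        table2_id INTEGER REFERENCES table2(id),
        amount INTEGER
    );
    INSERT INTO table2 VALUES (1, 'North', 'A'), (2, 'South', 'B');
    INSERT INTO table1 VALUES (10, 1, 100), (11, 1, 250), (12, 2, 75);

    -- Option 1: a view. The join runs every time the report is read.
    CREATE VIEW report_view AS
        SELECT t1.id, t1.amount, t2.region, t2.category
        FROM table1 AS t1
        JOIN table2 AS t2 ON t2.id = t1.table2_id;

    -- Option 2: a redundant copy. The join runs once, when the copy is built.
    CREATE TABLE report_copy AS SELECT * FROM report_view;
""")

# Right after the build, both options return the same rows.
view_rows = conn.execute("SELECT * FROM report_view ORDER BY id").fetchall()
copy_rows = conn.execute("SELECT * FROM report_copy ORDER BY id").fetchall()
print(view_rows == copy_rows)

# Once the source changes, the view follows it but the copy goes stale.
conn.execute("UPDATE table2 SET region = 'East' WHERE id = 2")
view_after = conn.execute("SELECT region FROM report_view WHERE id = 12").fetchone()[0]
copy_after = conn.execute("SELECT region FROM report_copy WHERE id = 12").fetchone()[0]
print(view_after, copy_after)
```

The sketch says nothing about speed at the sizes in the question; it only shows the trade-off in kind. The copy avoids the repeated join, but something has to rebuild or refresh it whenever the source data changes.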
Tom Thomson
The rule of thumb is that you always go to 3rd Normal Form (or preferably to Elementary Key Normal Form) and then look at whether you need to go to a higher normal form. The reason for that is correctness, not performance: you want to be sure that you don't end up with buggy or overcomplicated code.
Usually, going to 3NF (or EKNF) instead of staying at some lower normal form will also improve performance as well as reduce code complexity, because it reduces the size of your data.
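
As an illustration of the data-size point, here is a small sketch of a table that falls short of 3NF and its decomposition. It again uses Python's sqlite3 as a stand-in for SQL Server (an assumption), and the employee and department tables are hypothetical examples, not anything from the thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Below 3NF: dept_name depends on dept_id, not directly on the key emp_id.
    CREATE TABLE employee_flat (
        emp_id INTEGER PRIMARY KEY,
        dept_id INTEGER,
        dept_name TEXT
    );
    INSERT INTO employee_flat VALUES
        (1, 10, 'Engineering'), (2, 10, 'Engineering'),
        (3, 10, 'Engineering'), (4, 20, 'Finance');

    -- 3NF: each department name is stored exactly once.
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,
        dept_id INTEGER REFERENCES department(dept_id)
    );
    INSERT INTO department SELECT DISTINCT dept_id, dept_name FROM employee_flat;
    INSERT INTO employee SELECT emp_id, dept_id FROM employee_flat;
""")

# Count how many department-name values each design stores.
flat_names = conn.execute("SELECT COUNT(dept_name) FROM employee_flat").fetchone()[0]
norm_names = conn.execute("SELECT COUNT(dept_name) FROM department").fetchone()[0]
print(flat_names, norm_names)

# The join reconstructs the original rows, so no information is lost.
rejoined = conn.execute("""
    SELECT e.emp_id, e.dept_id, d.dept_name
    FROM employee AS e
    JOIN department AS d ON d.dept_id = e.dept_id
    ORDER BY e.emp_id
""").fetchall()
original = conn.execute("SELECT * FROM employee_flat ORDER BY emp_id").fetchall()
print(rejoined == original)
```

With four employees the saving is trivial, but the flat table stores one name per employee while the normalised design stores one name per department, so the gap grows with the row count.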
As a general rule, any redundancy in data costs both performance and code simplicity, so you end up with buggy and non-performant code. There may be cases where redundancy can be justified, but you need to verify the justification by seeing what actually happens when you normalise the redundancy out.
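
One way redundancy turns into bugs is the update anomaly. The sketch below is a hypothetical example (sqlite3 standing in for SQL Server, with invented order_flat, customer and orders tables) in which an update misses one of the redundant copies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Redundant design: the customer's city is repeated on every order.
    CREATE TABLE order_flat (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER,
        customer_city TEXT
    );
    INSERT INTO order_flat VALUES (1, 7, 'Leeds'), (2, 7, 'Leeds'), (3, 8, 'York');
""")

# Customer 7 moves, but the update only reaches one of their orders.
conn.execute("UPDATE order_flat SET customer_city = 'Bath' WHERE order_id = 1")

# Customers that now have more than one city on record.
conflicts = conn.execute("""
    SELECT customer_id, COUNT(DISTINCT customer_city)
    FROM order_flat
    GROUP BY customer_id
    HAVING COUNT(DISTINCT customer_city) > 1
""").fetchall()
print(conflicts)

conn.executescript("""
    -- Normalised design: the city lives in one row per customer.
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, customer_city TEXT);
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(customer_id)
    );
    INSERT INTO customer VALUES (7, 'Leeds'), (8, 'York');
    INSERT INTO orders VALUES (1, 7), (2, 7), (3, 8);
    UPDATE customer SET customer_city = 'Bath' WHERE customer_id = 7;
""")

# Every order for customer 7 now sees the same single city.
cities = conn.execute("""
    SELECT DISTINCT c.customer_city
    FROM orders AS o
    JOIN customer AS c ON c.customer_id = o.customer_id
    WHERE o.customer_id = 7
""").fetchall()
print(cities)
```

In the flat design, every piece of code that changes a city has to find every copy. In the normalised design there is only one copy to change.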
Obviously, if you build a data warehouse which doesn't permit any updates, it can safely be in a much lower normal form (with more redundancy, and hence greater data size, than if it were properly normalised). In that no-update case it will be worth doing some performance tests to see whether the normalised or the unnormalised form gives better performance.

Tom

Jeff Moden
Steven.Grzybowski (8/11/2016)
With third normal form, is there a general "rule of thumb" for when you should or should not apply it? I understand that a lot of small tables can cause performance issues and can make the database more complex, but is there any rule about the relative sizes of tables that should be split off? Also, for "smaller" fields (numeric values, small varchar fields), at what point does the performance difference between linking to another table and keeping the data in the same table become negligible, or turn in favor of keeping things in one table?

On a similar note, at what point does it make more sense to use a view rather than a table? A view can pull the data together, and redundant data is bad, but is redundant data justifiable for speed? Suppose the table that would hold the redundant data is only a couple hundred MB, while the table it would be joined to in a view is a few hundred GB. Does it make sense to store redundant data in the smaller table when it is accessed frequently? For example: Table1 (100,000 rows, 5 columns, 3 of which are redundant data) vs. Table1 joined to Table2 (100,000,000 rows, 20 columns). Building Table1 and filling in the redundant fields takes time, but the join to Table2 then happens once, instead of every time somebody wants to look at the report.


Do the two tables have a 1:1 relationship? If so, then the tables may be improperly designed. If not, then you need to put up with the join because, as Tom states, it's going to be a whole lot easier to maintain and, because of the reduced column width, it could actually be faster than one huge denormalized table.
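
For anyone unsure how to answer the 1:1 question, a quick check might look like the sketch below. It uses sqlite3 as a stand-in for SQL Server, and the schema (table2 carrying a table1_id column) is an assumption, since the thread never shows the real tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema: table2 rows point at table1 through table1_id.
    CREATE TABLE table1 (id INTEGER PRIMARY KEY);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY, table1_id INTEGER);
    INSERT INTO table1 VALUES (1), (2), (3);
    INSERT INTO table2 VALUES (100, 1), (101, 1), (102, 2), (103, 3);
""")

# Largest number of table2 rows that share a single table1 key.
max_per_key = conn.execute("""
    SELECT MAX(n) FROM (SELECT COUNT(*) AS n FROM table2 GROUP BY table1_id)
""").fetchone()[0]

# table1 rows that have no matching table2 row at all.
unmatched = conn.execute("""
    SELECT COUNT(*) FROM table1 AS t1
    WHERE NOT EXISTS (SELECT 1 FROM table2 AS t2 WHERE t2.table1_id = t1.id)
""").fetchone()[0]

# 1:1 here means every table1 row matches exactly one table2 row.
is_one_to_one = (max_per_key == 1 and unmatched == 0)
print(max_per_key, unmatched, is_one_to_one)
```

In this sample data, key 1 matches two rows, so the relationship is one-to-many and the join is doing real work rather than gluing together two halves of the same entity.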

--Jeff Moden

