• Smendle (7/14/2011)


    Great article, didnt read any previous articles before (assuming you started at 1). It should be noted that with any production DB system, even one that seems simple, changing it to conform to DB Normalization rules will create a lot of other work like fixing queries against said DB, reports will be broken, space will increase etc...

    That's why you should try to do all the normalisation while designing the schema, before writing code for queries and reports.

    Peronally I recommend that when you have a rough schema design you list all the business rules that you would like the schema to enforce for you without requiring code to do it, express those rules as functional dependencies where you can do so, and then use Bernsteins algorithm to derive a schema which conforms to 3NF (actually it will conform to EKNF, which is better than 3NF, even though Bernstein was only targetting 3NF when he designed the algorithm). Check and make sure that all business rules are now covered by key constraints (which will involve adding NOT NULL constraints, UNIQUE constraints, and Foreign Key constraints as needed - it may also involve adding some auxiliary tables if any of the weird cases for which BCNF was intended exist: this is usually a better bet than actually going to BCNF, since in those cases BCNF can't enforce all the functional dependencies, while EKNF can if you add the auxiliary tables). Then look at any business rules which couldn't be expressed as functional dependencies: if you didn't have to add auxiliary tables to enforce FDs without going to BCNF, you can normalise to 4NF to get a schema which enforces any business rules expressed as multivalued dependencies as well as those expressed as functional dependencies; or if you are lucky you may find that the multi-valued dependencies have no impact on the bits of the schema that needed the auxiliary tables, so you can enforce the multi-valued dependencies by taking the other parts of the schema to 4NF (there is no easy way like Bernstein's algorithm to get to 4NF - or even to BCNF - so this is a bit of a slog). At worst you will end up with an EKNF schema with bits of it in 4NF and some bits not in 4NF, and there the only advice I can offer is to examine each unenforced multi-valued dependency very carefully and decide whether to enforce it (taking another part of the schema to 4NF but losing some enforcement of simple FDs) or not - either way you will need code (either a good stored procedure-based interface with no direct access from the application to tables and views, only to SPs, or a set of triggers) to enforce the business rules that the key constraints can't enforce (obviously check constraints can help sometimes, but they can't do the whole job); then look and see if any of your tables have any non-trivial join dependencies, and if so are they implied by the candidate keys or not; if there are any not implied by the candidate keys, you may want to split those tables (ie go to 5NF) - but as with the MVDs, there can be a trade-off here, since the split may lose enforcement of FDs (and you may find that the number of tables involved in joins is getting too big, potentially making the code hard to understand). If you want to have a lot of time-dependent information - lots of things have values associated with a period in time, and you want to retain all the history - you may want to elimainate all non-trivial dependencies (go to 6NF) but I can't offer any advice there, it's not within my field of competence.

    Anyway, once you've done all that hard work of schema design, always being sure to allow as much flexibility as possible since the business rules may change next week (or next year, or whenever) you will find it's enormously easier to write simple and bug-free code than if you hadn't done it. If you just throw something together without any thought for normalisation you will find that you get some very complex code, and probably find it hard to maintain, hard to emhance, and discover that you need to normalise to survive - but it is now much harder that it was up front.

    Tom