What is Normalizations?

  • I need to Know ,

    What is normalizations and type of normalizations?

    With Some examples...

    Please any one help me...

  • Here's a place to start... http://en.wikipedia.org/wiki/Database_normalization

    After that, use your search engine of choice (Google, Yahoo....) to find info on E.F. Codd, "Normal Form", "Data Modeling", and other terms in the Wiki article.

  • You might want to pick up a book on the topic. It's as old as the hills in computer terms, but Flemming & von Halle's Handbook of Relational Design is still one of the best fundamental texts out there.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • [font="Verdana"]Here's something I wrote on the topic a while ago (note the Anglicised spelling versus the Americanised... just means we use an S instead of a Z.)

    Normalisation is the process of pulling out repeated data and data sets and storing them individually, rather than storing multiple copies.

    A pragmatic approach should be taken to normalisation. In some cases (particularly with decision support and reporting systems) it is important to denormalise for query performance. For transaction systems, it is important to normalise as much as practical for transaction throughput performance.

    For this reason you need to ensure that you understand how the database you are designing will work. If you have conflicting requirements (high transactions, extensive reporting) then separate out the data so that you can normalise the transactional data and denormalise the reporting data.

    Whatever your approach, be prepared to justify it in a design review. Documenting the decisions you make on why you have normalised or denormalised the data is a great idea.

    First Normal Form

    Relational database theory requires all records to have the same “shape”, i.e. not a variable number of columns or tables or fields. First normal form is a standard list of columns that make up your data.

    If you have optional data, list out all of the optional data possibilities. This could be broken into a list of rows with the same columns, or a related list of columns.

    Second Normal Form

    Where you have composite business keys (i.e. made up of multiple fields), you may have some data that is related to a sub-set of those fields. In such a case, you should split the data up into two entities.

    For example, if you have a composite key on (warehouse, part) and you have a field that is the warehouse address, and another field that is quantity, that would require two entities: one to contain (warehouse, address) and the other to contain (warehouse, part, quantity.)

    Third Normal Form

    Third normal form requires that all of the fields in an entity are directly related to the key, and that there isn’t a case where a field in an entity is related to another field in the same entity that is not a key.

    For example, if you have an entity (employee, department, location) and there is a relationship between department and location, this requires two entities: one to contain (employee, department) and another to contain (department, location).

    Fourth Normal Form

    This says that an entity should not mix data about independent relationships.

    For example, if we have employees with a skill and a language, skill and language are independent. So there is no actual direct relationship between skill and language. That means we should define an entity to contain (employee, skill) and another to contain (employee, language) rather than try to put those relationships into the same entity.

    This gets complicated when there is a relationship. For example, if one of the skills is cooking, and it has a relationship to the language (French, Greek, Italian, Chinese.) This is a poor example, because the actual relationship is between the style of cooking and the ethnicity and country of origin. But all of the recipe books may be written in French! In this case, it would not be a violation of fourth normal form to have one entity that links (employee, skill, language) because there is actually such a dependency.

    So modelling independent facts becomes a little complex when there are some interdependencies. You need to establish carefully what the dependencies actually are.

    Fifth Normal Form

    Fifth normal form involves reconstructing larger facts (or relationships) from smaller facts and relationships.

    For example, if we have agents who work for companies and who sell products. We could represent that as one entity containing (agent, company, product). However, that means we would repeat the information about the agent and company for every product that the agent can sell.

    It would be better to have several related entities: one for (agent, company) and one for (company, product). There is probably a third which lists the products that an agent can supply (since they can’t necessarily supply everything that an entire company does), and this would contain (agent, product).

    Beyond Fifth Normal Form

    Yes, it is possible! You can remove things like redundant composite keys, supply shorted surrogate keys for larger business keys and the like. In fact, the design techniques we encourage do exactly that. Arguably, this can be seen as a case of applying the fifth normal form. But the normal forms are defined to look just within one entity, not across multiple entities.

    [/font]

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply