Implementation of Normalization in SQL Server

  • Comments posted to this topic are about the item Implementation of Normalization in SQL Server

  • Since you are illustrating your process using a physical database, I think it would be instructive to include primary and foreign keys on your normalized tables to enforce your normalization.  Also, some queries joining your normalized tables could be presented to show that the data can be returned to look like the original, denormalized table.

    Using surrogate keys to present normalization is kind of awkward since normalization rules apply to natural keys.

  • hmbacon wrote:

    Using surrogate keys to present normalization is kind of awkward since normalization rules apply to natural keys.

    I wouldn't mind an illustration why tables with surrogate keys can't be subjected to normalization rules. Would be interesting to see, hoping for a reply on this!

     

  • Good article.  One critique.  You use of abbreviations in the object names made this difficult to follow.  dname?

    Michael L John
    If you assassinate a DBA, would you pull a trigger?
    To properly post on a forum:
    http://www.sqlservercentral.com/articles/61537/

  • I am sorry for the late response and don't know if you are still looking for an example.  By definition, a critical part of conceptual modeling is identifying the unique identifier (UID) of an entity. “The data depends on the key, the whole key and nothing but the key, so help me Codd. ( By the way, E. F. Codd defined the original 12 Rules of Normalization.)  If you don't identify the key, or UID, you are not performing conceptual modeling.

    A surrogate is something that stands in for something else.  Surrogate key columns are part of the relational model and stand in for the natural key(s) from the conceptual mode but do not replace the UID.

    Even if you don't agree that surrogate keys are not conceptual modelling artifacts, maybe this example can explain why a surrogate key is great for performance and saving space, but not for a UID.  Imagine a parts bin system for building cranes.  One of the entities is a representation of the organization of parts bins and any given bin is uniquely identified by the building number, room letter, cabinet number, drawer letter and slot number.  These 5 attributes are the UID or natural key.  These appear to be surrogate keys, and they are, but they are physically attached so someone can go to building 1234 and know they are in the right place.  To improve performance and to save space in the database, a surrogate key has been created for the bin table.  One of the influential developers worked on a project at his last job where only surrogate keys were used and no unique keys were created for the UID.  While the bin table had columns for building number, room letter, cabinet number, drawer letter and slot number, a unique key was not created for the UID.  The developer that convinced everyone to only use surrogate keys is working on the module to create the records for the bin locations.  He doesn't do anything to verify if a row already exists for a particular building number, room letter, cabinet number, drawer letter and slot number because he doesn't think users will do that.  His module will not stop a user from creating the same bin twice and that user, that can get the app to do things that were never intended, creates a new row with the same building number, room letter, cabinet number, drawer letter and slot number.  A row with a new unique surrogate key is created but the UID is now duplicated.  The page that shows a list of bin locations blows up in testing because too many rows are returned and someone "fixes" the issue by adding a TOP 1 to the query and all seems to work.  Then someone needs to add an ORDER BY clause to the query for that page for some other requirement and now the TOP 1 returns a row with a different surrogate key.  That bin with the new surrogate key shows that there are no parts in it so someone assigns another part to that bin.  Now the perfect storm occurs.  The stocking clerk puts the new part into the same bin and the different parts are the same except the have different strengths.  Bolts are picked from the bin, some sufficiently strong and others are not.  A crane is built and the bad bolts sheer off under load and 6 people are killed because the UID was not considered important.

    Could the code have been written better to prevent the duplicate.  Of course it could.  But in the cases where something is missed,  it is good to have the safety net of creating a Unique Key for the UID/natural key in the database to prevent the duplicates.  The database shouldn't be responsible for validations but it should be able to reveal code that can cause problems.

    I bet some people have seen duplicate problems like this where unique keys were not created on natural keys and only surrogate keys were used as primary keys.

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply