Help with Normalization

  • I am trying to normalize a table that I need for a new system I am putting together. What is the best way to normalize this table (since it has multiple weight fields). Here is a small section of the table layout.

    ProductId

    ProductName

    ProductDescription

    Weight

    Height

    Length

    Depth

    CaseQuantity

    CaseWeight

    CaseHeight

    CaseLength

    CaseDepth

    AssembledWeight

    AssembledHeight

    AssembledLength

    AssembledDepth

    I was first inclined to use the EAV model but I would prefer not to. Could I please get some input on a good way to normalize this type of structure.

    Thank You In Advance !

  • Rules for Normalization

    First Normal Form (1NF): There are four tasks to accomplish to achieve 1NF.

    - Eliminate unnecessary data.

    - Group the data to describe one thing.

    - Assign the key(s) to the table.

    - Eliminate repeating groups.

    Second Normal Form (2NF): Next...

    - The data must be in 1NF.

    - All non-key columns in each table are dependent on the whole key.

    Third Normal Form (3NF): This is where you want to be before you even think about de-normalizing your database.

    - The data must be in 2NF.

    - No columns are dependent on non-key columns.

    - There are no derived (calculated) values.

    The saying goes "Normalize until it hurts...De-normalize until it works", and there is a lot of truth and wisdom in this statement.

    I would also recommend reading anything you can find from Michelle Poolet. She is a highly regarded and recognized master archetect in the community, and is published out on SQL Server Magazine. Here is a link with several other related articles by her, that will definitely help you out.

    http://www.sqlmag.com/article/database-administration/database-design-essentials-129493

    I just finished a brand new db design that was painful because I did not start off this way. Good luck and ask lots of questions!

    🙂

  • I'd break out the Height, Length, Depth, Weight measures into... probably, four different tables, each with a type lookup into a lookup table. I'd consider making a single measure table, but I'm not sure about all the business requirements, so it's not my first choice with this small amount of information.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • could you define the columns e.g is product a subset of case is a subset of assembled?

    Like Grant said business requirements will define the relationship between the columns and as a result the degree of normalization.

    Jayanth Kurup[/url]

  • Compusol (6/20/2011)


    I am trying to normalize a table that I need for a new system I am putting together. What is the best way to normalize this table (since it has multiple weight fields). Here is a small section of the table layout.

    ProductId

    ProductName

    ProductDescription

    Weight

    Height

    Length

    Depth

    CaseQuantity

    CaseWeight

    CaseHeight

    CaseLength

    CaseDepth

    AssembledWeight

    AssembledHeight

    AssembledLength

    AssembledDepth

    I was first inclined to use the EAV model but I would prefer not to. Could I please get some input on a good way to normalize this type of structure.

    Please correct me if I"m wrong...

    1- List of columns appear to describe a single, unique *entity* which is PRODUCT.

    2- List of columns does not suggests *weight fields* change over time or version and, if they do list of columns appears to suggest business requirements are not calling for tracking such changes.

    IF #1=TRUE and #2=TRUE then model a single PROUCTS table.

    IF #1=TRUE and #2=FALSE then model PRODUCT table and a PRODUCT_WEIGHT table grouping there the items which changes are tracked over time or version - add either timestamp or version column depending on strategy.

    IF #1=FALSE please clarify further.

    Hope this helps.

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
  • First I would like to thank all of you for your input. Initially I was going to give more information but I thought by keeping it simple it would make it easier for you to compile a response. I guess seeing the answers I was wrong. Here is more information as you all have requested.

    1.) This request has stemmed from a desire to normalize a table that has 141 fields (VERY non-normalized). The fields above are only some of the "problems" this table has. Just trying to get it to first normal form has been a journey.

    2.) The business rules are a bit shaky as they really do not know what they want (they being upper management). So basically I am trying to make the design be as flexible as possible without killing myself. 🙂

    3.) Answers to Jayanth's questions:

    a. Some of the products come with multiple pieces in a case pack

    b. The Assembled concept is based upon the contents of the box as needing to be assembled (like a bicycle you would get from toys-r-us- all where all the parts are in the box, so the box is considered a single product, that needs to be assembled) Business rules do not allow for a productId for assembled / unassembled ( I am trying to make them see the need for this).

    4.) Answers to Paul's questions:

    a. Yes the columns do describe a single unique entity - product.

    b. Based upon the answer to 1 above I have to think that there could be a possibility of the values changing and "possibly" needing the ability to store history of the changes they make.

    I believe that one of the problems involved here is that fact that the entity has many "attributes" (height, weight) etc one would have a tendency to create a EAV (actually one of the developers here wants to do that) but I know about the baggage that can possible come with that decision. I am trying to create a Normalized Relational Model that will alow as much flexibility as possible without creating a monster. My first consideration was to use Grant's method but was not 100% sure it would return the best model (I am thinking a little differently now). With that, I was just curious how others would model something like this, hence the initial question.

    Thank you again for the help !

    CompuSol

  • Compusol (6/21/2011)


    First I would like to thank all of you for your input. Initially I was going to give more information but I thought by keeping it simple it would make it easier for you to compile a response. I guess seeing the answers I was wrong. Here is more information as you all have requested.

    1.) This request has stemmed from a desire to normalize a table that has 141 fields (VERY non-normalized). The fields above are only some of the "problems" this table has. Just trying to get it to first normal form has been a journey.

    2.) The business rules are a bit shaky as they really do not know what they want (they being upper management). So basically I am trying to make the design be as flexible as possible without killing myself. 🙂

    3.) Answers to Jayanth's questions:

    a. Some of the products come with multiple pieces in a case pack

    b. The Assembled concept is based upon the contents of the box as needing to be assembled (like a bicycle you would get from toys-r-us- all where all the parts are in the box, so the box is considered a single product, that needs to be assembled) Business rules do not allow for a productId for assembled / unassembled ( I am trying to make them see the need for this).

    4.) Answers to Paul's questions:

    a. Yes the columns do describe a single unique entity - product.

    b. Based upon the answer to 1 above I have to think that there could be a possibility of the values changing and "possibly" needing the ability to store history of the changes they make.

    I believe that one of the problems involved here is that fact that the entity has many "attributes" (height, weight) etc one would have a tendency to create a EAV (actually one of the developers here wants to do that) but I know about the baggage that can possible come with that decision. I am trying to create a Normalized Relational Model that will alow as much flexibility as possible without creating a monster. My first consideration was to use Grant's method but was not 100% sure it would return the best model (I am thinking a little differently now). With that, I was just curious how others would model something like this, hence the initial question.

    Thank you again for the help !

    CompuSol

    Like I said, I wasn't sure I'd want to go EAV on it. Those things can be a messy bear to maintain. But, you might want to look into some of the different Bill Of Materials models that are out there. That might be a satisifying approach now that you've outlined a bit more of what's needed.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Grant,

    The issue that I was really trying to figure out was the best way to normalize the Height, weight, length, etc. In your opinion would it be best to have 3 different tables i.e. Prod_measure, Prod_CaseMeasure, Prod_AssemMeasure and then have 3 fields in the products table that is a FK to the PK in each of the measure tables ?

    Something like this:

    Prod_Measure

    prodMeasureId

    weight

    height

    length

    width

    Prod_CaseMeasure

    prodCaseMeasureId

    weight

    height

    length

    width

    Prod_AssemMeasure

    prodAssemMeasureId

    weight

    height

    length

    width

    Then have MeasureId, CaseMeasureId, and AssemMeasureId fields in the product table ?

    One of my concerns is this... There could be a strong possibility that each productId could have 1 record in each table. Or maybe worse not have ANY records in one of the tables. I am not 100% certain that it really makes a difference but if there is a chance that they would all have records in each table, why not just put the fields into the Product table ? I think I know the answer to this but I am just looking for affirmation.

  • No, I would have gone a different way. A heigh table, a width table, etc., and then a Type table that can be mapped to each of the other tables to break them up by Prod, Case, Assembly... that way you can add other types of height measure without having to add tables. It's moving toward an EAV approach, but not going all the way.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Grant,

    Sorry but I must be missing something. My initial design sounds just like what you are describing but from what I did I would consider it to be a EAV. Here is what I had:

    Product

    ProductId

    ProdDescription

    MeasureType

    MeasureTypeId

    MeasureType

    ProductMeasurement

    ProductMeasurementId

    ProductId

    MeasurementTypeId

    HeightMeasurement

    HeightMeasurementId

    HeightMeasurement

    LengthMeasurement

    LengthMeasurementId

    LengthMeasurement

    WidthMeasurement

    WidthMeasurementId

    WidthMeasurement

    Based upon this I was only able to make it work by creating the EAV.

    Thank You

    CompuSol

  • No, you could go one more level deep. Have three tables. One for the measure, one for the measure type (height, width, weight), and one for the business measure type (prod, etc.). That's what I thought you meant by EAV. Normalize right down to nothing. I just find those types of designs difficult to maintain and, depending on the distribution of the data, really poor performers. Although, with SQL Server 2008 filtered indexes, you might be able to avoid that issue.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • I would target this normalization very differently. Instead of going all the way to having an ID for a length amount, I'd expect some standardization (cm for example) of length/width/height, and then break out the table like the following:

    Product

    ProductID

    ProductName

    Desc

    QuantityPerCase (Default 1)

    ProductDimensions

    ProductID

    DimensionType (Case, Assembled, etc)

    Weight (standardized, ounces, pounds, whatever, math is quick later on presentation)

    Height

    Length

    Depth


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • Grant,

    Yes, ok.. now I see what you are getting at and yes I agree with you, it can get a bit slow. I understand the need for normalization and the fact that sometimes being too normal is not good (although my wife says I should act more normal but that is another conversation). I think I am going to step back a few levels and see what I can come up with.

    Thank you very much for helping me on this.

    Compusol

  • Craig,

    Thank you for the input. I used your idea with a few small twists. Looks good for what I need.

    Thank You !

    CompuSol

Viewing 14 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply