Cocktail Recipe Database Schema - Tables Question - Normalization

  • I'm fairly new to database design. I have done a few in the past, however, I am trying to normalize my current database, and may be overthinking it, or not thinking enough.

    In the current design I have an ingredients table which holds all ingredients. I am wondering if I should separate this table into many based upon ingredient type. For instance, "spriits" table, Juice table, etc... I am wondering if I leave all ingredients in one table. or will this make for inefficient queries in the future when table becomes large. I want to be able to search by, ingredients, or recipe name, and in future by shots/category etc..

    I am assuming the more table the more efficient, and expandable a database for unknown future uses. Am I thinning correctly. Attached is my current EDR diagram.

    [/img]

  • Given what I imagine is limited data in your ingredients table I would not break the table up for performance reasons. When I say limited I am assuming you have maybe thousands of records?

    I wouldn't say that more tables is more efficient. I would say the correct number of tables is most efficient and from you diagram you look to be in good shape.

    You could always categorize your ingredients (fruits, spirits, etc) either within the ingredients table or in a separate schema.

  • Thank you very much for the answer.

    If I was to create another schema wouldn't that be separating the ingredients into their own tables? For instance, Vodka, Rum => 'Sprits', and orange, mango =>'Juices'. If i tried to categorize them into the ingredients table, I would end up with redundancies . I imagine I would need to create an ingredientType table then use foreign key in ingredients table to avoid any redundancies?

    Funny thing when one starts over-thinking relations confusion starts to set in. I will leave the schema alone, and finish. I can alway add this later, but my thinking is it would be less of an 'update' pain if I did this now. However, I am unsure if having more tables, as you stated, would be more efficient, or better for possible future uses. As of now I can only think of two ways one would search for a cocktail; by recipe name, or by ingredients on-hand. I thought that some may actually try to find all 'Spirits' that contain 40% alcohol, or maybe organic juices, but this is probably way overthinking the logic.

    Thank you again, I appreciate your perspective, and advice. 🙂

    -M-

  • Actually normalization (more tables) causes slower performance. That is the nature of the beast. However, in a relational database such as this you want to normalize. When designing your data don't think about how easy it is query and even worse is to change your schema based on performance. This is known as pre-optimization and it will cause you untold amounts of pain in the future.

    I would suggest that you have an ingredient table and each ingredient can be of a type (juice, spirit, malt beverage, etc). Then create an IngredientType table to hold the string value.

    I would avoid the use of sql reserved words like 'name'. Remember that there is nothing wrong with using some descriptive text with object and column names. It will make your life a lot easier in the long run if you use more descriptive names.

    One other recommendation is to check your spelling. Category is misspelled. This may not seem like a big deal but common words spelled incorrectly in a database are not fun to work with. Measurement is also misspelled.

    Your Category table has 2 columns I don't quite understand (martini and non-alchoholic). Those seem like rows instead of columns.

    I am not quite sure what the quantity table is. It looks like a 1:1 with recipes?

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Thank you for your feedback. I am not done with the database, and the ERD has some tables that may be deleted, or are thoughts at this point.

    I see the misspelled words (I'm horrible) I will take your advice and use more descriptive names. I was overthinking the idea. Measurements is going to hold "tablespoon" aka 'Tb', etc.. This is mostly for the form to submit keeping different spellings, or abbreviations out of the recipes. Still working on overall layout.

    I have much work to do and will take your suggestions (recommendations) to heart. The category table is just a thought, I may use may not. I want to possible organize drinks by category for simplicity because there are many types of martini's, amount others. Was just thinking if the three tables Ingredients, Recipes, and drinks were 3NF. I have a horrible habit of overanalyzing.

    I will use the links you provided to further develop my understanding.

    Thank you again.:-)

  • Sean Lange (7/29/2014)


    Actually normalization (more tables) causes slower performance.

    "It Depends". 😉

    If you have de-normalized table that looks like a spreadsheet with date columns in it, queries can be both difficult to write and very slow in performance. "Normalizing" such a table to be more like an EAV table will allow for proper indexing, easy non-reporting style lookups-aggregations-etc, and still allow for some very high performance CrossTabs and Pivots for reporting purposes while also making it easier to create "moving window" reports.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Sean Lange (7/29/2014)


    Actually normalization (more tables) causes slower performance. That is the nature of the beast. However, in a relational database such as this you want to normalize. When designing your data don't think about how easy it is query and even worse is to change your schema based on performance. This is known as pre-optimization and it will cause you untold amounts of pain in the future.

    My experience with reasonably straightforward databases is that nomalisation to 3NF improves performance because it reduces the data size significantly. Of course if you end up needing to join 5o tables the performance will be awful, but tends to happen when you normalise to Date's 6NF and not when you normalise to 3NF.

    I would suggest that you have an ingredient table and each ingredient can be of a type (juice, spirit, malt beverage, etc). Then create an IngredientType table to hold the string value.

    That looks like the right way to do it to me.

    I would avoid the use of sql reserved words like 'name'. Remember that there is nothing wrong with using some descriptive text with object and column names. It will make your life a lot easier in the long run if you use more descriptive names.

    One other recommendation is to check your spelling. Category is misspelled. This may not seem like a big deal but common words spelled incorrectly in a database are not fun to work with. Measurement is also misspelled.

    I agree with those comments.

    Your Category table has 2 columns I don't quite understand (martini and non-alchoholic). Those seem like rows instead of columns.

    I am not quite sure what the quantity table is. It looks like a 1:1 with recipes?

    I don't understand those bits either.

    Tom

  • Sean Lange (7/29/2014)


    Actually normalization (more tables) causes slower performance.

    I want to let this go, but I just can't.

    No. Not true.

    Everything, everything, within a structured storage engine comes with trade-offs and costs. Nothing is free. TANSTAAFL always applies. But a properly normalized structure with appropriate indexes and enforced referential constraints generally outperformers a flat, denormalized structure. This is because the relational engine is optimized around supporting the relational storage and retrieval of data. Can we identify situations where this is not true? Absolutely. But by and large the normalized structure is superior for the types of applications and needs that structured, relational storage is built for.

    We can go down the road of talking about id/value pairs of unstructured storage for data collection. But, they don't support querying the way relational storage does. So, different requirements, different structures, different performance paradigms. So, if we're talking about different applications and different needs, then, yes, relational storage falls off a cliff and normalization hurts performance. But then, we shouldn't be using a referential engine for the storage anyway.

    Sorry, so sorry, just saw this statement one too many times this month.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Grant Fritchey (7/30/2014)


    Sean Lange (7/29/2014)


    Actually normalization (more tables) causes slower performance.

    I want to let this go, but I just can't.

    No. Not true.

    Everything, everything, within a structured storage engine comes with trade-offs and costs. Nothing is free. TANSTAAFL always applies. But a properly normalized structure with appropriate indexes and enforced referential constraints generally outperformers a flat, denormalized structure. This is because the relational engine is optimized around supporting the relational storage and retrieval of data. Can we identify situations where this is not true? Absolutely. But by and large the normalized structure is superior for the types of applications and needs that structured, relational storage is built for.

    We can go down the road of talking about id/value pairs of unstructured storage for data collection. But, they don't support querying the way relational storage does. So, different requirements, different structures, different performance paradigms. So, if we're talking about different applications and different needs, then, yes, relational storage falls off a cliff and normalization hurts performance. But then, we shouldn't be using a referential engine for the storage anyway.

    Sorry, so sorry, just saw this statement one too many times this month.

    Oh how I regret not making my original comment less absolute. :w00t:

    I would never suggest we denormalize our structures in most normal situations for many reasons.

    And certainly no need to apologize for calling me to task when I post something so blatantly incorrect.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Sean Lange (7/30/2014)


    Grant Fritchey (7/30/2014)


    Oh how I regret not making my original comment less absolute. :w00t:

    I would never suggest we denormalize our structures in most normal situations for many reasons.

    And certainly no need to apologize for calling me to task when I post something so blatantly incorrect.

    Ha! Again, apologies. I know I went off on a rant and that was wrong. Maybe I need to step away from the forums for a little while.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Grant Fritchey (7/30/2014)


    Sean Lange (7/30/2014)


    Grant Fritchey (7/30/2014)


    Oh how I regret not making my original comment less absolute. :w00t:

    I would never suggest we denormalize our structures in most normal situations for many reasons.

    And certainly no need to apologize for calling me to task when I post something so blatantly incorrect.

    Ha! Again, apologies. I know I went off on a rant and that was wrong. Maybe I need to step away from the forums for a little while.

    Not at all. My comment was flat out incorrect as stated. I have no problem when somebody tells me I am wrong. Nothing you said was snarky or personal at all. You pointed out I was wrong and backed it up with facts. Jeff and Tom also made similar comments about my statement. Keep them coming Grant. We learn more from our mistakes than our successes. 😀

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Think about how you may want to report on or search your data when you consider your categorization, things can go from simple to complicated real fast.

    Simple-One category per ingredient. It can be difficult to fit a messy world into a rigid scheme. Is port a wine or a spirit?

    Tag Words-You can associate multiple words with a ingredient. Flexible for searching but can get out of control.

    Hierarchical-You would need strict data entry and querying becomes more complicated.

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply