Pseudo code help

  • Need a little guidance regarding a personal project. I don't want any actual code, just pointers in the right direction.

    There are two tables: recipes and ingredients
    RecipeID
    Recipe
    Choose
    ---------------------
    IngredientID
    Ingredient
    ----------------------

    Recipes are given as a whole (ingredients and instructions are in the same field/column). The Ingredients table holds a list of ingredients available for use (you can assume that there is at least one recipe with an available ingredient). 

    What would be the most efficient way to 'Choose' the smallest total number of recipes using all avail ingredients?

    Pseudo code so far is:
    1.  For each ingredient, count instances found in recipe and store that value somewhere.
    2.  For each recipe, count available ingredients and store value somewhere.
    3.  Find recipe with highest number of ingredients and 'Choose'.
    Here is where my logic seems to go off track...
    Once a recipe is CHOSEN, then only need to look for recipes that include the other ingredients. It is perfectly acceptable to have the same ingredient in every recipe as long as all ingredients are being used.

    Would love to see your approach to solve! Functions vs cursors vs other creative ideas.

    Thanks!!

  • IMTM2 - Saturday, December 1, 2018 2:32 PM

    Need a little guidance regarding a personal project. I don't want any actual code, just pointers in the right direction.

    There are two tables: recipes and ingredients
    RecipeID
    Recipe
    Choose
    ---------------------
    IngredientID
    Ingredient
    ----------------------

    Recipes are given as a whole (ingredients and instructions are in the same field/column). The Ingredients table holds a list of ingredients available for use (you can assume that there is at least one recipe with an available ingredient). 

    What would be the most efficient way to 'Choose' the smallest total number of recipes using all avail ingredients?

    Pseudo code so far is:
    1.  For each ingredient, count instances found in recipe and store that value somewhere.
    2.  For each recipe, count available ingredients and store value somewhere.
    3.  Find recipe with highest number of ingredients and 'Choose'.
    Here is where my logic seems to go off track...
    Once a recipe is CHOSEN, then only need to look for recipes that include the other ingredients. It is perfectly acceptable to have the same ingredient in every recipe as long as all ingredients are being used.

    Would love to see your approach to solve! Functions vs cursors vs other creative ideas.

    Thanks!!

    I'd suggest that you need at least one additional table, call it RecipeIngredient or similar, containing (RecipeId, IngredientId) as its primary key.. This resolves the many-to-many join between recipes and ingredients.

    Once you have this available, your queries become easy to formulate.

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • Using your proposed algorithm you won't necessarily find the minimum number of recipes that use all ingredients. It might be a good approach to finding a good low value of recipes but if you want the actual minimum set you need another approach.
    A simple demonstration is with these recipes:
    Recipie1, Ingredients: A, B, C, D
    Recipie2, Ingredients: A, B, E
    Recipie3, Ingredients: C, D, F
    You would end up choosing all three recipes when just 2 and 3 would do.

  • Jonathan AC Roberts - Sunday, December 2, 2018 10:42 AM

    Using your proposed algorithm you won't necessarily find the minimum number of recipes that use all ingredients. It might be a good approach to finding a good low value of recipes but if you want the actual minimum set you need another approach.
    A simple demonstration is with these recipes:
    Recipie1, Ingredients: A, B, C, D
    Recipie2, Ingredients: A, B, E
    Recipie3, Ingredients: C, D, F
    You would end up choosing all three recipes when just 2 and 3 would do.

    To be honest, the following requirement does not make sense as it stands:

    What would be the most efficient way to 'Choose' the smallest total number of recipes using all avail ingredients?

    because the answer to the question: "How many recipes use all available ingredients?" is a single number ... so it's not a case of smallest or largest.

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • Phil Parkin - Sunday, December 2, 2018 10:59 AM

    Jonathan AC Roberts - Sunday, December 2, 2018 10:42 AM

    Using your proposed algorithm you won't necessarily find the minimum number of recipes that use all ingredients. It might be a good approach to finding a good low value of recipes but if you want the actual minimum set you need another approach.
    A simple demonstration is with these recipes:
    Recipie1, Ingredients: A, B, C, D
    Recipie2, Ingredients: A, B, E
    Recipie3, Ingredients: C, D, F
    You would end up choosing all three recipes when just 2 and 3 would do.

    To be honest, the following requirement does not make sense as it stands:

    What would be the most efficient way to 'Choose' the smallest total number of recipes using all avail ingredients?

    because the answer to the question: "How many recipes use all available ingredients?" is a single number ... so it's not a case of smallest or largest.

    I think you're concentrating on the semantics and not the pragmatics. From the other things mentioned in the initial post about the proposed approach I think the OP means:
    What would be the most efficient way to choose the smallest number of recipes that together use all avail ingredients (with overlaps allowed)?
    I think this is an NP-hard problem (so not easy to solve efficiently)

  • Thank you, Phil and Jonathan, for your input!
    Jonathan, You are 100% correct! Using your example, I would only want to select Recipe 2 and 3 since they do include all avail ingredients (and yes, overlaps are allowed). I thought that my wording of the question wasn't terribly clear but you seem to have no problem understanding the end goal.  Your phrasing is better: What would be the most efficient way to choose the smallest number of recipes that together use all avail ingredients (with overlaps allowed).

    It feels like a question of which comes first: chicken or egg?

    So far, I do have a join table containing the RecipeID and IngredientID and I am able to get their counts.
    First step, a stored procedure 'chooses' the Recipe that has the highest number of avail Ingredients and from this, the avail ingredients are 'chosen'.
    Once the ingredients have been "chosen", they aren't looked for again. (Begin Loop?)
    The next Ingredient that has not been 'chosen' should look for the Recipe that contains that ingredient (and has the highest number of ingredients).
    I'm thinking a loop would work after the first Recipe has been 'chosen', but is this the best way to go about it?
    Aren't loops kind of like using row operations vs set? (I may be demonstrating my level of tinhorn-ness, but I really do want to learn how to write efficient code!)

    Any hints or advice will be greatly appreciated.

  • IMTM2 - Sunday, December 2, 2018 6:50 PM

    Thank you, Phil and Jonathan, for your input!
    Jonathan, You are 100% correct! Using your example, I would only want to select Recipe 2 and 3 since they do include all avail ingredients (and yes, overlaps are allowed). I thought that my wording of the question wasn't terribly clear but you seem to have no problem understanding the end goal.  Your phrasing is better: What would be the most efficient way to choose the smallest number of recipes that together use all avail ingredients (with overlaps allowed).

    It feels like a question of which comes first: chicken or egg?

    So far, I do have a join table containing the RecipeID and IngredientID and I am able to get their counts.
    First step, a stored procedure 'chooses' the Recipe that has the highest number of avail Ingredients and from this, the avail ingredients are 'chosen'.
    Once the ingredients have been "chosen", they aren't looked for again. (Begin Loop?)
    The next Ingredient that has not been 'chosen' should look for the Recipe that contains that ingredient (and has the highest number of ingredients).
    I'm thinking a loop would work after the first Recipe has been 'chosen', but is this the best way to go about it?
    Aren't loops kind of like using row operations vs set? (I may be demonstrating my level of tinhorn-ness, but I really do want to learn how to write efficient code!)

    Any hints or advice will be greatly appreciated.

    You should be able to do nearly all the work just from the join table.
    It also might be worth looking at ingredients that are included in the fewest recipes. If you choose these recipes first it might limit the amount of searching you have to do onwards.

  • This is a variation on the Set Cover Problem and has been proven to be NP-Complete.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply