Recursive User-Defined Function

  • I have a database the uses product trees. A product tree is a Bill of Materials (BOM). A BOM is used as the basis of production orders. When a production order is created the items on the basis BOM are added as items to the production order. The BOM may contain material items, other BOMs, and phantoms. A phantom is a hybrid BOM or list that contains items that commonly go together. By adding a phantom BOM to a BOM the user saves having to add all the items in the phantom. When a BOM that is the basis of a production order contains a sub-BOM, the sub-BOM is added to the production order as one line item. When a BOM that is the basis of a production order contains a phantom, the phantom’s components are added to the production order instead of the phantom itself. The tricky part is that some or all of a phantom’s components can also be a phantom. Because only phantom components that are not themselves phantoms can end up on a production order, it is necessary to recur through the phantom trees until you have only their non-phantom components. After all the recursion is complete, you have a list of all the components of a production order.

    To solve this problem, my initial approach was to use a Common Table Expression (CTE) as a recursive query. But I came to realize that the table the query had to be based on was a many-to-many table; that is, not only can the parents have many children but the children can have many parents. This condition immediately destroys the accuracy of the resulting record set because it will have too many rows.

    My next approach was to use a recursive function in VB .NET. One of the functions parameters is an SQL statement having one field to indicate whether or not that row is a phantom. The function iterates through the record set defined by the SQL statement and adds the row data to the production order. Whenever the value of the phantom indicator is true, the function does not add the row’s data to the production order but instead calls itself with another SQL statement to get the components of that phantom. The process is continued until all of the components of the production have been added to it. The result of this approach is accurate; but it has two main problems: it is very slow and it is prone to database timeout errors. This is because every phantom row encountered is required to perform a call to the database to get its components.

    It then occurred to me that if I could use a T-SQL user-defined function that is both recursive and returns a record set, only one call to the database would be required. If that function could return a record set containing all the final components to be added to the production, I could use VB to iterate through it and add each item to the production order when its row is processed. I don’t know where to start or even if it is possible. We are using SQL Server 2005. Does anyone have any advice that will help me to get started?

  • So is this a custom built system, or are you modifying an off the shelf product?

    I'm making a few assumptions here:

    1. You know how to identify "top level" items.

    2. Your structure lets you find all children one level down from a given level

    What I did to solve a problem very similar to yours was:

    1. create a temp table with your item ids and a "level" variable

    2. seed your temp table with the top level item(s) at level 0 that you want to calculate a BOM for.

    do:

    3. increment level

    4. join from temp table back to source table to find all children where parent item is in level - 1 and insert into temp table with current level

    until records returned is 0

    this lets you explore all branches of all sub boms in a stored procedure, non-recursive way, and at the end, you can simply select everything from your temp table that is not a phantom.

    I hope I understood your problem correctly, and I hope this helps.

  • I frankly don’t understand the intent of your first paragraph. In addition, my issue description says, “We are using SQL Server 2005.” Yes, I am stuck with the database; I did not design it.

    You say did I not follow proper netiquette. Would you please provide some guidance on this? I basically described the issue as I saw it.

    I solved the problem by writing a recursive stored procedure that only needed to recall itself when it encountered a phantom. It was very fast and I only had to connect to the database once and I received a record set that had all the data I needed. I was writing an add-on to an application by a major company which later told me I was not allowed to use a stored procedure of any kind unless they wrote it. I ended up using a recursive CTE which turned out to be slower because I had to connect to the database as often as the top level BOM had a phantom.

    Three and a half years have passed since I posted the issue and was a bit surprised by your response to it. I only recently became familiar with nested sets within a database and have used that approach with recursive CTEs for creating org charts; creating production orders from BOMs is next. The book I’m using for this refers to your SQL for Smarties book but it looks like you have a whole book covering trees and hierarchies in SQL and will consider getting it.

    I don’t understand the Ackermann function and couldn’t find an inabstruse explanation on the Internet. Maybe your book will do a better job of it.

  • CELKO (6/18/2011)


    Avoid procedural code in SQL whenever you can. Recursive CTEs are really cursor loops, not full recursion...

    That's funny coming from a guy who's famous for using a 1960's Push-Stack/While Loop to do a conversion from an Adjacency List to a Nested Set. Speaking of "vague narratives", you really should try something other than that first paragraph. 😉

    People who've met you in person say that you're a nice guy. Try being one online and in your books and maybe your books will sell better.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Mike, don't worry about Joe's accusation of violating Netiquette. He actually, per his own confession, tries to be extremely rude in his online personna. He accuses everyone of violating ISO rules, violating Netiquette, et al, in order to be as offensive as possible. It doesn't actually mean you did anything wrong.

    I see from the timing (three years) and from your reply that you have a solution at this time. If you'd like help improving it, can you post table structures, some sample data (preferably in the form of insert statements instead of just text), and your desired end result of querying the sample data? If what you have is good enough and you aren't still interested in optimization, don't bother, of course. But if you'd like some help with it at this time, please feel free to post that.

    (Joe's an interesting case. He's very technically proficient, and I'm told he's very personable face-to-face, but he's stuck in the idea that rudeness, arrogance, and insisting that people follow "standards" that you have to pay money in order to view, are good things when online.)

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • GSquared (6/20/2011)


    He's very technically proficient

    Considering that he recently recommended FK's on First and LastName columns and he still uses a While Loop/Push Stack to convert from an Adjacency List to a Nested Set, I'll have to disagree with that notion. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (6/20/2011)


    GSquared (6/20/2011)


    He's very technically proficient

    Considering that he recently recommended FK's on First and LastName columns and he still uses a While Loop/Push Stack to convert from an Adjacency List to a Nested Set, I'll have to disagree with that notion. 😉

    Okay. I'll clarify. He displays significant but inconsistent technical proficiency, oddly combined with periods of acute technical WTFness of magnitude.

    Better?

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • GSquared (6/20/2011)


    Jeff Moden (6/20/2011)


    GSquared (6/20/2011)


    He's very technically proficient

    Considering that he recently recommended FK's on First and LastName columns and he still uses a While Loop/Push Stack to convert from an Adjacency List to a Nested Set, I'll have to disagree with that notion. 😉

    Okay. I'll clarify. He displays significant but inconsistent technical proficiency, oddly combined with periods of acute technical WTFness of magnitude.

    Better?

    LOLOLOLOLOLOL!

    Gaby________________________________________________________________"In theory, theory and practice are the same. In practice, they are not." - Albert Einstein

  • Hi GSquared and Jeff,

    Thanks for your kind comments. Jeff, I read both of your links and added a bookmark for each one. They actually contain advice on how to logically think through a problem to come up with a good solution for it at the outset. Posters following your advice may actually find that they are able to solve the problem on their own. In addition, it follows a process often used by authors of books on SQL in order to provide readers some simplified data to help with the learning. That is what I should have done before my original post.

    I am no longer working for the same company but I am interested in T-SQL solutions for graphs, trees, and hierarchies. I’m reading though a book by Itzik Ben-Gan et al, the one I used three year ago for the CTE that I used for my final solution. When I finish the Ben-Gan book, which has only one, though quite thorough chapter about these issues, I’d like to study more. Since it looks like the Celko book may not be ideal (I read some Amazon reviews), do you know of a good source of tutorials I could use for this?

    As a VB .NET developer, I was faced with the need to become very knowledgeable on both .NET and SQL. One requirement was to create production orders based on the hierarchical nature of the BOM table. But I came across a condition where a BOM with only 6 line items, 4 of which were phantoms, required 4 minutes for a production order containing 385 line items to be created from it. The code I had inherited was using a recursive VB function. Every time it encountered a phantom it had to recall itself to query the database to get the phantom's members. I added code to determine how many times the database was being hit and discovered that it was over 100 times for this BOM. I believed and still do that, for data-centric applications, it is best to do as much of the processing right in the database as possible. This principle was not being followed by the VB code. So I looked for a way to hit the database only once and used the recursive stored procedure to do it. The procedure took less than a second to run and the result was that the original 4 minutes was reduced to 1 minute. I couldn’t get lower than that because I had to use their SDK to create the production order based on the record set from the stored procedure. They didn’t want any objects created unless policed by their SDK.

    This experience and others led me to the notion that because many organizations look for developers who could do everything, desktop .NET, ASP .NET, SQL, etc., they are being very foolish. The knowledge required to create major applications is too great for this to work. They end up getting jacks of all trades and masters of none. The resulting applications then are not all that great. This policy is contrary to SOA or any of the other well-regarded architectural designs. I think there needs to be more division of the effort and a major part is the database piece. I think that all the data requesting should be handled by people who know all the abstruse features of databases and the let desktop and ASP developers do their thing without distraction. What do you think?

    I hope I’m not being too verbose here for SQL Server Central, but I think this issue is quite important. If I am, I apologize. The next time I post with a problem issue, I will follow the advice from the links you provided.

    Thanks,

    Mike

  • GSquared (6/20/2011)


    Jeff Moden (6/20/2011)


    GSquared (6/20/2011)


    He's very technically proficient

    Considering that he recently recommended FK's on First and LastName columns and he still uses a While Loop/Push Stack to convert from an Adjacency List to a Nested Set, I'll have to disagree with that notion. 😉

    Okay. I'll clarify. He displays significant but inconsistent technical proficiency, oddly combined with periods of acute technical WTFness of magnitude.

    Better?

    Better? BWAA-HAA!!! I couldn't have said it any better! 😀

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • MikeAngelastro-571287 (6/20/2011)


    Hi GSquared and Jeff,

    Thanks for your kind comments. Jeff, I read both of your links and added a bookmark for each one. They actually contain advice on how to logically think through a problem to come up with a good solution for it at the outset. Posters following your advice may actually find that they are able to solve the problem on their own. In addition, it follows a process often used by authors of books on SQL in order to provide readers some simplified data to help with the learning. That is what I should have done before my original post.

    I am no longer working for the same company but I am interested in T-SQL solutions for graphs, trees, and hierarchies. I’m reading though a book by Itzik Ben-Gan et al, the one I used three year ago for the CTE that I used for my final solution. When I finish the Ben-Gan book, which has only one, though quite thorough chapter about these issues, I’d like to study more. Since it looks like the Celko book may not be ideal (I read some Amazon reviews), do you know of a good source of tutorials I could use for this?

    As a VB .NET developer, I was faced with the need to become very knowledgeable on both .NET and SQL. One requirement was to create production orders based on the hierarchical nature of the BOM table. But I came across a condition where a BOM with only 6 line items, 4 of which were phantoms, required 4 minutes for a production order containing 385 line items to be created from it. The code I had inherited was using a recursive VB function. Every time it encountered a phantom it had to recall itself to query the database to get the phantom's members. I added code to determine how many times the database was being hit and discovered that it was over 100 times for this BOM. I believed and still do that, for data-centric applications, it is best to do as much of the processing right in the database as possible. This principle was not being followed by the VB code. So I looked for a way to hit the database only once and used the recursive stored procedure to do it. The procedure took less than a second to run and the result was that the original 4 minutes was reduced to 1 minute. I couldn’t get lower than that because I had to use their SDK to create the production order based on the record set from the stored procedure. They didn’t want any objects created unless policed by their SDK.

    This experience and others led me to the notion that because many organizations look for developers who could do everything, desktop .NET, ASP .NET, SQL, etc., they are being very foolish. The knowledge required to create major applications is too great for this to work. They end up getting jacks of all trades and masters of none. The resulting applications then are not all that great. This policy is contrary to SOA or any of the other well-regarded architectural designs. I think there needs to be more division of the effort and a major part is the database piece. I think that all the data requesting should be handled by people who know all the abstruse features of databases and the let desktop and ASP developers do their thing without distraction. [font="Arial Black"]What do you think?[/font]

    I hope I’m not being too verbose here for SQL Server Central, but I think this issue is quite important. If I am, I apologize. The next time I post with a problem issue, I will follow the advice from the links you provided.

    Thanks,

    Mike

    What I think is that you're a very wise man, Mike. There are some people on this very forum who are actually outstanding in both worlds. I find that they're they extraordinary exception rather then even coming close to the rule.

    Shifting gears, I've not tried to generate BOM's using "phantoms" as you say. From the sounds of it, you did quite well. Other than the Ben-Gan chapters, I don’t know of any really good tutorials that explain the “big 3” hierarchical structures along with how they can be made to interplay with each other. There are quite a few that explain each kind more-or-less separately. They can usually be found by Googling for “Adjacency List Hierarchy”, “Hierarchical Path”, and “Nested Sets Hierarchy”. Of course, be real careful with the code that accompanies those articles… I found that a lot of the principles given are spot on but the code is sometimes more than a wee-bit performance challenged, silently misses some nodes here and there, or just flat-out doesn’t work until a couple of repairs have been made (or course, making those repairs tends to deepen the understanding quite a bit). With exception of Ben-Gan’s chapters, I’ve found that the more famous the author is, the more likely that one of these problems will be present, so test and validate everything.

    Is there something special you’re looking for in a tutorial on hierarchies?

    Thank you for the kind words on the links in my signature. Gail’s link (the second one) is especially helpful in trying to science-out performance problems. And, as you suggested, both links have inspired some folks to write in and say “I was going to ask a question but those two links helped me figure it out on my own.”

    It's been a real pleasure and I really enjoyed your "miniature article" on this thread. I really do wish more people thought like you do. 🙂

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff,

    Thanks for your kind words. You put a big smile on my face. I don't what else to say but, "Thank you!"

    The tutorials would use both English, that is, be without the fancy math symbols, and provide reasonable examples, that is, they should be simple enough to avoid obfuscation and complex enough to represent reality.

    Thanks again.

    Mike

  • I'm definitely for that. Although I can certainly understand why some people might be interested in all the fancy relational and hierarchical calculus, some folks just want a simple explanation so they can build something and maintain it.

    I did something a while ago and I'm going to convert it to an article (it's finally boiling up to the top of my list of things to do). I don't want to post what I have here because it's not ready as an article, so I'll make you a simple trade... I'll send you what I have if you provide an honest (even if it's bad) critique on the methods and general direction before I turn it into an article, please.

    And don't be such a stranger on these forums. It's a great place to learn new things and maybe even vent a little. I've learned more by solving other people's problems than I could have if I did 1,000 tutorials or had a 1,000 job history.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff,

    I would be happy to review your preliminary article. I'm sure I will learn a great deal in the process. I might even be helpful. 🙂

    I will try not to be a stranger to SQL Server Central. Two of my joys in life are working on puzzles and discussing software development issues.

    Send it on. If you provide a private email address, I'll send it back to you there.

    Regards,

    Mike

  • Mike, on your point about specialization vs Swiss-army-knife devs, yes, more companies would benefit from specialization than are aware of it.

    If you turned that post into an editorial, and sent it to Steve (the editor for this site), I think he'd love it. Would probably generate some interesting discussions too.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

Viewing 15 posts - 1 through 15 (of 18 total)

You must be logged in to reply to this topic. Login to reply