Trying to select both "paths" through data

  • I'm working with one of the developers, he has an application which logs the "path" taken by a process.  The process can take multiple parallel paths as well as loop back to previous steps.  What he's hoping can be done, is a query to return the path(s) taken by the process.  Currently, the process is linear, with only one path from point A to point B.
    So, first, the desired results (which at this point I do not believe is going to be possible:)
    CandidateID  ParentStep  ChildStep
    A                   1                   2
    A                   2                   3
    A                   3                   4
    A                   3                   5
    A                   4                   9
    A                   5                   6
    A                   6                   7
    A                   7                   5
    A                   5                   9
    Step 9 is the final step and yes, there's no "parent" for step 1 which I feel is wrong.  Bear in mind, this is not in production, or even started being coded up.
    Next up, test data and what I've been messing with to try to get it to work.
    create table #testdata (
    CandidateID char(1)
    , ParentStep int
    , ChildStep int
    );

    insert into #testdata values
    ('A', 1, 2)
    , ('A', 2, 3)
    , ('A', 3, 4)
    , ('A', 4, 9)
    , ('A', 3, 5)
    , ('A', 5, 6)
    , ('A', 6, 7)
    , ('A', 7, 5)
    , ('A', 5, 9);

    select *
    from #testdata
    order by parentStep;

    with cte_data as (
    select candidateid
     , parentStep
     , childStep
    from #testdata
    where parentStep is null
    union all
    select td.candidateid
     , td.parentStep
     , td.childStep
    from #testdata as td
    where td.parentStep is not null
    )
    select candidateid
    , parentStep
    , ChildStep
    from cte_data
    order by parentStep

    OK, as can be seen I though maybe a recursive CTE might do the trick, but it doesn't it spits out the same results as a plain SELECT * from the data.

    I strongly suspect (and have told the dev) that what he wants won't be possible, not with a SQL query, as it requires some of the data to be "repeated" from the source data, which won't happen, as far as SQL is concerned, it's not even "the same data," it just is.

    So, what I'm looking for is answers to a couple questions:
    1.  Is what the dev wants even possible, in a reasonably performant manner, from a straight-up SQL query?  (I suspect, no)
    2.  Is there even a way to do what the dev wants with a SQL query?  (Again, I suspect no)

    Thanks all.

  • So, based on the sample data what is the expected output?

  • Lynn Pettis - Wednesday, May 23, 2018 1:18 PM

    So, based on the sample data what is the expected output?

    Expected / desired output was at the top of the post, but reposted below (and formatted nicely!)

    CandidateID ParentStep ChildStep
    A            1             2
    A            2             3
    A            3             4
    A            3             5
    A            4             9
    A            5             6
    A            6             7
    A            7             5
    A            5             9

  • jasona.work - Wednesday, May 23, 2018 1:26 PM

    Lynn Pettis - Wednesday, May 23, 2018 1:18 PM

    So, based on the sample data what is the expected output?

    Expected / desired output was at the top of the post, but reposted below (and formatted nicely!)

    CandidateID ParentStep ChildStep
    A            1             2
    A            2             3
    A            3             4
    A            3             5
    A            4             9
    A            5             6
    A            6             7
    A            7             5
    A            5             9

    So tell us why the combination of 5 9 is at the bottom of the list ?   Everything earlier than that follows a very predictable pattern, as well as everything after...

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • sgmunson - Friday, May 25, 2018 6:07 AM

    jasona.work - Wednesday, May 23, 2018 1:26 PM

    Lynn Pettis - Wednesday, May 23, 2018 1:18 PM

    So, based on the sample data what is the expected output?

    Expected / desired output was at the top of the post, but reposted below (and formatted nicely!)

    CandidateID ParentStep ChildStep
    A            1             2
    A            2             3
    A            3             4
    A            3             5
    A            4             9
    A            5             6
    A            6             7
    A            7             5
    A            5             9

    So tell us why the combination of 5 9 is at the bottom of the list ?   Everything earlier than that follows a very predictable pattern, as well as everything after...

    That wound up being one of the reasons I told the Dev he wasn't going to be able to get the data out of the system with a simple SQL query.  Basically, each of the numbers represents a step in the candidate process, which can "loop back" on itself (the 7-5 part) then skip the intervening steps on the way back through (the 5-9 part.)

    Which, when I looked at it and thought it through more, I realized that it wasn't going to be possible (SQL would either order the results by one of the three columns OR return the data in no particular order, but there was NO WAY to return it the way he wanted.)

    So he's going to be looking at other ways to do what he wants, possibly on the application side (which brought him and I to a discussion of either pulling all the data to the application at once, or pulling it one "page" at a time.  I lean towards all at once as I feel it would give a better user experience, a bit longer up front, faster to page through the results, he leans towards loading one "page" at a time)  So we'll see where this goes.

  • jasona.work - Friday, May 25, 2018 6:21 AM

    sgmunson - Friday, May 25, 2018 6:07 AM

    jasona.work - Wednesday, May 23, 2018 1:26 PM

    Lynn Pettis - Wednesday, May 23, 2018 1:18 PM

    So, based on the sample data what is the expected output?

    Expected / desired output was at the top of the post, but reposted below (and formatted nicely!)

    CandidateID ParentStep ChildStep
    A            1             2
    A            2             3
    A            3             4
    A            3             5
    A            4             9
    A            5             6
    A            6             7
    A            7             5
    A            5             9

    So tell us why the combination of 5 9 is at the bottom of the list ?   Everything earlier than that follows a very predictable pattern, as well as everything after...

    That wound up being one of the reasons I told the Dev he wasn't going to be able to get the data out of the system with a simple SQL query.  Basically, each of the numbers represents a step in the candidate process, which can "loop back" on itself (the 7-5 part) then skip the intervening steps on the way back through (the 5-9 part.)

    Which, when I looked at it and thought it through more, I realized that it wasn't going to be possible (SQL would either order the results by one of the three columns OR return the data in no particular order, but there was NO WAY to return it the way he wanted.)

    So he's going to be looking at other ways to do what he wants, possibly on the application side (which brought him and I to a discussion of either pulling all the data to the application at once, or pulling it one "page" at a time.  I lean towards all at once as I feel it would give a better user experience, a bit longer up front, faster to page through the results, he leans towards loading one "page" at a time)  So we'll see where this goes.

    Okay, but why is this the only data being kept?   I have no idea what kind of process this data could possibly represent, and I'm guessing we're not seeing any of the genuinely relevant detail, as this just sounds like a truly crappy database design, or just a bad idea to begin with.   This might be because I don't see enough detail to have any idea of WHY the objective is what it is, nor what constrains the data to this in the first place.   Splain, Lucy!  (pardon the reference to a 1950's sitcom named I Love Lucy, starring Desi Arnez and Lucille Ball.  It was just too appropriate to resist,)

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • sgmunson - Friday, May 25, 2018 6:46 AM

    jasona.work - Friday, May 25, 2018 6:21 AM

    That wound up being one of the reasons I told the Dev he wasn't going to be able to get the data out of the system with a simple SQL query.  Basically, each of the numbers represents a step in the candidate process, which can "loop back" on itself (the 7-5 part) then skip the intervening steps on the way back through (the 5-9 part.)

    Which, when I looked at it and thought it through more, I realized that it wasn't going to be possible (SQL would either order the results by one of the three columns OR return the data in no particular order, but there was NO WAY to return it the way he wanted.)

    So he's going to be looking at other ways to do what he wants, possibly on the application side (which brought him and I to a discussion of either pulling all the data to the application at once, or pulling it one "page" at a time.  I lean towards all at once as I feel it would give a better user experience, a bit longer up front, faster to page through the results, he leans towards loading one "page" at a time)  So we'll see where this goes.

    Okay, but why is this the only data being kept?   I have no idea what kind of process this data could possibly represent, and I'm guessing we're not seeing any of the genuinely relevant detail, as this just sounds like a truly crappy database design, or just a bad idea to begin with.   This might be because I don't see enough detail to have any idea of WHY the objective is what it is, nor what constrains the data to this in the first place.   Splain, Lucy!  (pardon the reference to a 1950's sitcom named I Love Lucy, starring Desi Arnez and Lucille Ball.  It was just too appropriate to resist,)

    I'm sorry Ricky!!!
    (I Love Lucy was still on, at least in reruns, when I was a kid.)
    And you are correct, this is only a tiny bit of the data, enough to get the idea what's trying to be accomplished.  There are already tables with all the relevant candidate information in them, this data is basically one step past pseudo-code.  Because of where I work, I don't think I could post the actual (existing) table structures, not if I want to keep my job...
    Even this (possible) solution was myself and the dev spitballing ideas, the structure of the table was something I ginned up to give this a try.

    So at present, the process only has one way through, with the occasional return to a previous step, which the dev can keep track of and report on (I don't entirely agree with his method, but it's working, the table that records the information has an identity column which is what he uses to keep things in order.)  But the new system for processing candidates will allow for some steps to occur in parallel (and only once) with other steps, as well as looping back to previous steps, as well as (eventually) many steps concurring in parallel...

  • jasona.work - Friday, May 25, 2018 1:01 PM

    sgmunson - Friday, May 25, 2018 6:46 AM

    jasona.work - Friday, May 25, 2018 6:21 AM

    That wound up being one of the reasons I told the Dev he wasn't going to be able to get the data out of the system with a simple SQL query.  Basically, each of the numbers represents a step in the candidate process, which can "loop back" on itself (the 7-5 part) then skip the intervening steps on the way back through (the 5-9 part.)

    Which, when I looked at it and thought it through more, I realized that it wasn't going to be possible (SQL would either order the results by one of the three columns OR return the data in no particular order, but there was NO WAY to return it the way he wanted.)

    So he's going to be looking at other ways to do what he wants, possibly on the application side (which brought him and I to a discussion of either pulling all the data to the application at once, or pulling it one "page" at a time.  I lean towards all at once as I feel it would give a better user experience, a bit longer up front, faster to page through the results, he leans towards loading one "page" at a time)  So we'll see where this goes.

    Okay, but why is this the only data being kept?   I have no idea what kind of process this data could possibly represent, and I'm guessing we're not seeing any of the genuinely relevant detail, as this just sounds like a truly crappy database design, or just a bad idea to begin with.   This might be because I don't see enough detail to have any idea of WHY the objective is what it is, nor what constrains the data to this in the first place.   Splain, Lucy!  (pardon the reference to a 1950's sitcom named I Love Lucy, starring Desi Arnez and Lucille Ball.  It was just too appropriate to resist,)

    I'm sorry Ricky!!!
    (I Love Lucy was still on, at least in reruns, when I was a kid.)
    And you are correct, this is only a tiny bit of the data, enough to get the idea what's trying to be accomplished.  There are already tables with all the relevant candidate information in them, this data is basically one step past pseudo-code.  Because of where I work, I don't think I could post the actual (existing) table structures, not if I want to keep my job...
    Even this (possible) solution was myself and the dev spitballing ideas, the structure of the table was something I ginned up to give this a try.

    So at present, the process only has one way through, with the occasional return to a previous step, which the dev can keep track of and report on (I don't entirely agree with his method, but it's working, the table that records the information has an identity column which is what he uses to keep things in order.)  But the new system for processing candidates will allow for some steps to occur in parallel (and only once) with other steps, as well as looping back to previous steps, as well as (eventually) many steps concurring in parallel...

    With that additional information, I guess maybe the real question is what business process is going to record the action of progressing from any given step in the process to a valid next step?    You can certainly (and easily) build a table that has all possible VALID steps in the process, along with all possible NEXT steps for the step a given record represents, and then constrain the application to logging only those specific progressions, and just have a datetime value record WHEN that occurred.   Thus I'm not really sure what the objective here is, as recording steps isn't some huge challenge.   How to constrain them is also trivial, so again, what's the real objective of the existence of the table?   If I just had a table with a step value for the beginning step and a step value for the next step, I could validate that it's valid just by having a table where all possible step progressions are stored as step pairs.   It might be a lot harder to do that if there is ever any kind of constraint based on anything other than the previous step, such as the number of times a given step has been used in the process...   Such might require a trigger.

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • sgmunson - Tuesday, May 29, 2018 11:06 AM

    jasona.work - Friday, May 25, 2018 1:01 PM

    sgmunson - Friday, May 25, 2018 6:46 AM

    jasona.work - Friday, May 25, 2018 6:21 AM

    That wound up being one of the reasons I told the Dev he wasn't going to be able to get the data out of the system with a simple SQL query.  Basically, each of the numbers represents a step in the candidate process, which can "loop back" on itself (the 7-5 part) then skip the intervening steps on the way back through (the 5-9 part.)

    Which, when I looked at it and thought it through more, I realized that it wasn't going to be possible (SQL would either order the results by one of the three columns OR return the data in no particular order, but there was NO WAY to return it the way he wanted.)

    So he's going to be looking at other ways to do what he wants, possibly on the application side (which brought him and I to a discussion of either pulling all the data to the application at once, or pulling it one "page" at a time.  I lean towards all at once as I feel it would give a better user experience, a bit longer up front, faster to page through the results, he leans towards loading one "page" at a time)  So we'll see where this goes.

    Okay, but why is this the only data being kept?   I have no idea what kind of process this data could possibly represent, and I'm guessing we're not seeing any of the genuinely relevant detail, as this just sounds like a truly crappy database design, or just a bad idea to begin with.   This might be because I don't see enough detail to have any idea of WHY the objective is what it is, nor what constrains the data to this in the first place.   Splain, Lucy!  (pardon the reference to a 1950's sitcom named I Love Lucy, starring Desi Arnez and Lucille Ball.  It was just too appropriate to resist,)

    I'm sorry Ricky!!!
    (I Love Lucy was still on, at least in reruns, when I was a kid.)
    And you are correct, this is only a tiny bit of the data, enough to get the idea what's trying to be accomplished.  There are already tables with all the relevant candidate information in them, this data is basically one step past pseudo-code.  Because of where I work, I don't think I could post the actual (existing) table structures, not if I want to keep my job...
    Even this (possible) solution was myself and the dev spitballing ideas, the structure of the table was something I ginned up to give this a try.

    So at present, the process only has one way through, with the occasional return to a previous step, which the dev can keep track of and report on (I don't entirely agree with his method, but it's working, the table that records the information has an identity column which is what he uses to keep things in order.)  But the new system for processing candidates will allow for some steps to occur in parallel (and only once) with other steps, as well as looping back to previous steps, as well as (eventually) many steps concurring in parallel...

    With that additional information, I guess maybe the real question is what business process is going to record the action of progressing from any given step in the process to a valid next step?    You can certainly (and easily) build a table that has all possible VALID steps in the process, along with all possible NEXT steps for the step a given record represents, and then constrain the application to logging only those specific progressions, and just have a datetime value record WHEN that occurred.   Thus I'm not really sure what the objective here is, as recording steps isn't some huge challenge.   How to constrain them is also trivial, so again, what's the real objective of the existence of the table?   If I just had a table with a step value for the beginning step and a step value for the next step, I could validate that it's valid just by having a table where all possible step progressions are stored as step pairs.   It might be a lot harder to do that if there is ever any kind of constraint based on anything other than the previous step, such as the number of times a given step has been used in the process...   Such might require a trigger.

    I think the issue is that some steps can sometimes lead to a step not in sequence (IE, 4 can lead to 9.)
    Keeping track of which step led where isn't the issue, where the problem came in was the dev in question wanted to be able to pull out *all* the paths the process took with a single query.

    I think I see what you're suggesting could be done, but my understanding from the dev is that the process won't *always* follow the same progression (yeah, horrible business process, I think it more relates to when certain tasks and processes complete.)

    Thanks again for looking at this, but at this point the dev is going a different route to get the data out the way he wants.

  • jasona.work - Tuesday, May 29, 2018 1:08 PM

    sgmunson - Tuesday, May 29, 2018 11:06 AM

    jasona.work - Friday, May 25, 2018 1:01 PM

    sgmunson - Friday, May 25, 2018 6:46 AM

    jasona.work - Friday, May 25, 2018 6:21 AM

    That wound up being one of the reasons I told the Dev he wasn't going to be able to get the data out of the system with a simple SQL query.  Basically, each of the numbers represents a step in the candidate process, which can "loop back" on itself (the 7-5 part) then skip the intervening steps on the way back through (the 5-9 part.)

    Which, when I looked at it and thought it through more, I realized that it wasn't going to be possible (SQL would either order the results by one of the three columns OR return the data in no particular order, but there was NO WAY to return it the way he wanted.)

    So he's going to be looking at other ways to do what he wants, possibly on the application side (which brought him and I to a discussion of either pulling all the data to the application at once, or pulling it one "page" at a time.  I lean towards all at once as I feel it would give a better user experience, a bit longer up front, faster to page through the results, he leans towards loading one "page" at a time)  So we'll see where this goes.

    Okay, but why is this the only data being kept?   I have no idea what kind of process this data could possibly represent, and I'm guessing we're not seeing any of the genuinely relevant detail, as this just sounds like a truly crappy database design, or just a bad idea to begin with.   This might be because I don't see enough detail to have any idea of WHY the objective is what it is, nor what constrains the data to this in the first place.   Splain, Lucy!  (pardon the reference to a 1950's sitcom named I Love Lucy, starring Desi Arnez and Lucille Ball.  It was just too appropriate to resist,)

    I'm sorry Ricky!!!
    (I Love Lucy was still on, at least in reruns, when I was a kid.)
    And you are correct, this is only a tiny bit of the data, enough to get the idea what's trying to be accomplished.  There are already tables with all the relevant candidate information in them, this data is basically one step past pseudo-code.  Because of where I work, I don't think I could post the actual (existing) table structures, not if I want to keep my job...
    Even this (possible) solution was myself and the dev spitballing ideas, the structure of the table was something I ginned up to give this a try.

    So at present, the process only has one way through, with the occasional return to a previous step, which the dev can keep track of and report on (I don't entirely agree with his method, but it's working, the table that records the information has an identity column which is what he uses to keep things in order.)  But the new system for processing candidates will allow for some steps to occur in parallel (and only once) with other steps, as well as looping back to previous steps, as well as (eventually) many steps concurring in parallel...

    With that additional information, I guess maybe the real question is what business process is going to record the action of progressing from any given step in the process to a valid next step?    You can certainly (and easily) build a table that has all possible VALID steps in the process, along with all possible NEXT steps for the step a given record represents, and then constrain the application to logging only those specific progressions, and just have a datetime value record WHEN that occurred.   Thus I'm not really sure what the objective here is, as recording steps isn't some huge challenge.   How to constrain them is also trivial, so again, what's the real objective of the existence of the table?   If I just had a table with a step value for the beginning step and a step value for the next step, I could validate that it's valid just by having a table where all possible step progressions are stored as step pairs.   It might be a lot harder to do that if there is ever any kind of constraint based on anything other than the previous step, such as the number of times a given step has been used in the process...   Such might require a trigger.

    I think the issue is that some steps can sometimes lead to a step not in sequence (IE, 4 can lead to 9.)
    Keeping track of which step led where isn't the issue, where the problem came in was the dev in question wanted to be able to pull out *all* the paths the process took with a single query.

    I think I see what you're suggesting could be done, but my understanding from the dev is that the process won't *always* follow the same progression (yeah, horrible business process, I think it more relates to when certain tasks and processes complete.)

    Thanks again for looking at this, but at this point the dev is going a different route to get the data out the way he wants.

    Well, if it's possible to go to a step that's "out of sequence", then it's up to the business to decide what the valid sequences are.  However, letting the business just do whatever the heck it wants via the developer is just a shortcut to an eventual nightmare.   By then, it could well be almost impossible to fix.   Better to clamp down on that now and enforce some discipline on the business';use of IT personnel.   If IT management won't do it, then they should be the ones to take the blame when it breaks.

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • sgmunson - Tuesday, May 29, 2018 2:50 PM

    jasona.work - Tuesday, May 29, 2018 1:08 PM

    sgmunson - Tuesday, May 29, 2018 11:06 AM

    jasona.work - Friday, May 25, 2018 1:01 PM

    sgmunson - Friday, May 25, 2018 6:46 AM

    jasona.work - Friday, May 25, 2018 6:21 AM

    That wound up being one of the reasons I told the Dev he wasn't going to be able to get the data out of the system with a simple SQL query.  Basically, each of the numbers represents a step in the candidate process, which can "loop back" on itself (the 7-5 part) then skip the intervening steps on the way back through (the 5-9 part.)

    Which, when I looked at it and thought it through more, I realized that it wasn't going to be possible (SQL would either order the results by one of the three columns OR return the data in no particular order, but there was NO WAY to return it the way he wanted.)

    So he's going to be looking at other ways to do what he wants, possibly on the application side (which brought him and I to a discussion of either pulling all the data to the application at once, or pulling it one "page" at a time.  I lean towards all at once as I feel it would give a better user experience, a bit longer up front, faster to page through the results, he leans towards loading one "page" at a time)  So we'll see where this goes.

    Okay, but why is this the only data being kept?   I have no idea what kind of process this data could possibly represent, and I'm guessing we're not seeing any of the genuinely relevant detail, as this just sounds like a truly crappy database design, or just a bad idea to begin with.   This might be because I don't see enough detail to have any idea of WHY the objective is what it is, nor what constrains the data to this in the first place.   Splain, Lucy!  (pardon the reference to a 1950's sitcom named I Love Lucy, starring Desi Arnez and Lucille Ball.  It was just too appropriate to resist,)

    I'm sorry Ricky!!!
    (I Love Lucy was still on, at least in reruns, when I was a kid.)
    And you are correct, this is only a tiny bit of the data, enough to get the idea what's trying to be accomplished.  There are already tables with all the relevant candidate information in them, this data is basically one step past pseudo-code.  Because of where I work, I don't think I could post the actual (existing) table structures, not if I want to keep my job...
    Even this (possible) solution was myself and the dev spitballing ideas, the structure of the table was something I ginned up to give this a try.

    So at present, the process only has one way through, with the occasional return to a previous step, which the dev can keep track of and report on (I don't entirely agree with his method, but it's working, the table that records the information has an identity column which is what he uses to keep things in order.)  But the new system for processing candidates will allow for some steps to occur in parallel (and only once) with other steps, as well as looping back to previous steps, as well as (eventually) many steps concurring in parallel...

    With that additional information, I guess maybe the real question is what business process is going to record the action of progressing from any given step in the process to a valid next step?    You can certainly (and easily) build a table that has all possible VALID steps in the process, along with all possible NEXT steps for the step a given record represents, and then constrain the application to logging only those specific progressions, and just have a datetime value record WHEN that occurred.   Thus I'm not really sure what the objective here is, as recording steps isn't some huge challenge.   How to constrain them is also trivial, so again, what's the real objective of the existence of the table?   If I just had a table with a step value for the beginning step and a step value for the next step, I could validate that it's valid just by having a table where all possible step progressions are stored as step pairs.   It might be a lot harder to do that if there is ever any kind of constraint based on anything other than the previous step, such as the number of times a given step has been used in the process...   Such might require a trigger.

    I think the issue is that some steps can sometimes lead to a step not in sequence (IE, 4 can lead to 9.)
    Keeping track of which step led where isn't the issue, where the problem came in was the dev in question wanted to be able to pull out *all* the paths the process took with a single query.

    I think I see what you're suggesting could be done, but my understanding from the dev is that the process won't *always* follow the same progression (yeah, horrible business process, I think it more relates to when certain tasks and processes complete.)

    Thanks again for looking at this, but at this point the dev is going a different route to get the data out the way he wants.

    Well, if it's possible to go to a step that's "out of sequence", then it's up to the business to decide what the valid sequences are.  However, letting the business just do whatever the heck it wants via the developer is just a shortcut to an eventual nightmare.   By then, it could well be almost impossible to fix.   Better to clamp down on that now and enforce some discipline on the business';use of IT personnel.   If IT management won't do it, then they should be the ones to take the blame when it breaks.

    The requirement may be for a process state trace. This type of trace may be used in a batch program to verify that all paths in a complex program were executed to verify testing - it has nothing directly to do with SQL. But if the SQL implementation is for a complex process that requires best fit selection for multiple conditions some iteration may be required if the conditions are correlated, for example to select a vacation package with the best price or multiple date ranges - flight, seats, direct or one or more stops, auto rental, hotel, etc. The result may be an exact match, a close match, or a choice of multiple matches. A state trace can be used to tune or verify the selection algorithm. For a SQL process the easiest method is to declare a string
    DECLARE @trc varchar(500)
    and append a ',n' at each decision where n is the is the step or decision number. Output the varchar when the processing is complete for each request and parse and analyze later. If the SQL code includes GO statements then a temp table is required. The varchar contains the sequence of various SQL steps and captures the iterations. The string may be parsed by the included commas or output as a CSV file when processing is complete.

  • If it's analysing data produced from a system that has run to completion, I think it can be achieved but you probably need to record additional data about the order of events taken so that you can understand which destination was taken as a result of which visit to any given step.

    If, on the other hand, you have a list of possible paths and want to show every way you can go through the system then the ability to loop back to a previous step effectively reduces it to the halting problem which can be proven impossible to solve.

  • andycadley - Thursday, May 31, 2018 3:23 PM

    If it's analysing data produced from a system that has run to completion, I think it can be achieved but you probably need to record additional data about the order of events taken so that you can understand which destination was taken as a result of which visit to any given step.

    If, on the other hand, you have a list of possible paths and want to show every way you can go through the system then the ability to loop back to a previous step effectively reduces it to the halting problem which can be proven impossible to solve.

    And further, any attempt to get ALL POSSIBLE paths when there's no limit to which step you can actually go has an infinite number of possible path sequences, and regardless of "solvability", it is at  absolute minimum, impractical.   For this reason, most businesses impose business process on IT systems to ensure that the data accurately imposes on stored data the proper characteristics and constraints on that data, which in this case would, of NECESSITY, govern the choice of path at ALL times.   Thus this is information that the business should ALREADY be aware of, and deriving it from a query should only be something you do AFTER the data has been properly constrained before being recorded.  If IT systems remain unconstrained by business rules and/or business logic, then don't expect a good solution for this, or perhaps even at all.

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • Jason - -

    This looks like a variant of the adjacency list issue.  Jeff wrote up something a few years back on it.

    http://www.sqlservercentral.com/articles/Hierarchy/94040/

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Matt Miller (4) - Monday, June 4, 2018 12:17 PM

    Jason - -

    This looks like a variant of the adjacency list issue.  Jeff wrote up something a few years back on it.

    http://www.sqlservercentral.com/articles/Hierarchy/94040/

    Yeah, I looked at treating it as a hierarchy initially, but that didn't pan out.  Using the diagram of the employee org chart in Jeff's example, add a line indicating that Kim (EmpID 14) *ALSO* reports to Lisa (EmpID 18)  That seemed to throw a monkey wrench into the solutions.

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply