Leverage window function to break out of triangle join?

  • I have an issue where I wrote a triangle join and then realized that when doing some testing and I want to move away from that solution.
    This is a vendor data structure so I can't change anything about it.

    I do want to make a view to show the "desired output". 

    The issue is this:
    I have a table, the table holds information on containers, (think like large cargo boxes down to pallets, down to individual boxes).

    One of the fields is a DueDate field.  The issue is this:
    In the table data DueDate is only populated for the top level (1=top and  2 lower than 1, 3 lower than 2, 4 = lowest, etc...)  and it is inherited for every level BELOW that....
    unless a different date is populated.  Then it stops inheriting the parent and that level becomes the parent and it continues down.

    Sort of like a rolling DOWN , roll down until you hit new value then use the new and roll it down .
    It rolls down as a ragged hierarchy and each "." in the ContainerID indicates a deeper level.  So while multiple rows are level 3, they have different heritage.  Z7281.2.A1 shares the same progenitor (Z7281) but have different children: Z7281.2.A1 stops at Depth 3 but it's sibling Z7281.2.A3 continues on to have additional children: Z7281.2.A3.78,79,80,etc...  (ragged hierarchy)

     

    Note on the left only a few of those dates are populated.  On the right the color coding matches the "roll down".

    I had tried joining on depth 1  (as anchor) values using LIKE 'ContainerID ' + '%' so it would join to the top level and iterate down through all the children.  *EDIT: forgot to say I did the like AND depth = depth+1 from the anchored CTE

    However, that triangle joins out of control.

    I am wondering what other approaches would be?

    I could do a cursor loop but I REALLY wanted to avoid that and just handle it in a view.  I was thinking LEAD LAG or previous with the windowing functions?

    Thoughts or suggestions?

    Thanks!!!

    Create table and sample data attached.

    EDIT: Resorted sample data via ID:

  • LEAD/LAG will not work, because you need to specify the distance from the current row, and the distance isn't constant.  You also made it more difficult to see the correct solution by ordering by the depth instead of by the ContainerID (which incorporates the hierarchy).

    The solution below converts values to BINARY, because you don't need to worry about internationalization, specifically date formatting for converting back and forth to (n)(var)char.

    ;
    WITH CTE AS
    (
        SELECT *, CAST(ContainerID AS BINARY(50)) + CAST(DueDate AS BINARY(5)) AS val
        FROM #ContainerDepth
    )
    SELECT *, CAST(SUBSTRING(MAX(val) OVER(ORDER BY ContainerID ROWS UNBOUNDED PRECEDING),51,5) AS DATE)
    FROM CTE
    ORDER BY ContainerID

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • Some refinement is possible.  For example, for any single column that has a unique index on it, you wouldn't need to count the values, instead simply get the number of rows in the index.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • drew.allen - Monday, September 25, 2017 1:49 PM

    LEAD/LAG will not work, because you need to specify the distance from the current row, and the distance isn't constant.  You also made it more difficult to see the correct solution by ordering by the depth instead of by the ContainerID (which incorporates the hierarchy).

    The solution below converts values to BINARY, because you don't need to worry about internationalization, specifically date formatting for converting back and forth to (n)(var)char.

    ;
    WITH CTE AS
    (
        SELECT *, CAST(ContainerID AS BINARY(50)) + CAST(DueDate AS BINARY(5)) AS val
        FROM #ContainerDepth
    )
    SELECT *, CAST(SUBSTRING(MAX(val) OVER(ORDER BY ContainerID ROWS UNBOUNDED PRECEDING),51,5) AS DATE)
    FROM CTE
    ORDER BY ContainerID

    Drew

    Thank you for your help!

    Unfortunately, it doesn't work for some of the ContainerIDs that are "siblings" such as the last 4.

    Doing just a sorted roll down doesn't work in this case because row 19 should have inherited from row 13 still (row 18 had its own specified so it would use that)

    Row 20 would inherit from row 3 and rows 21 and 22 would inherit from row 21 as row 21 had its own date specified.

  • In that case, you want to use a recursive CTE.  I don't have the time right now to look at it.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • Maxer - Monday, September 25, 2017 1:17 PM

    I have an issue where I wrote a triangle join and then realized that when doing some testing and I want to move away from that solution.
    This is a vendor data structure so I can't change anything about it.

    I do want to make a view to show the "desired output". 

    The issue is this:
    I have a table, the table holds information on containers, (think like large cargo boxes down to pallets, down to individual boxes).

    One of the fields is a DueDate field.  The issue is this:
    In the table data DueDate is only populated for the top level (1=top and  2 lower than 1, 3 lower than 2, 4 = lowest, etc...)  and it is inherited for every level BELOW that....
    unless a different date is populated.  Then it stops inheriting the parent and that level becomes the parent and it continues down.

    Sort of like a rolling DOWN , roll down until you hit new value then use the new and roll it down .
    It rolls down as a ragged hierarchy and each "." in the ContainerID indicates a deeper level.  So while multiple rows are level 3, they have different heritage.  Z7281.2.A1 shares the same progenitor (Z7281) but have different children: Z7281.2.A1 stops at Depth 3 but it's sibling Z7281.2.A3 continues on to have additional children: Z7281.2.A3.78,79,80,etc...  (ragged hierarchy)

     

    Note on the left only a few of those dates are populated.  On the right the color coding matches the "roll down".

    I had tried joining on depth 1  (as anchor) values using LIKE 'ContainerID ' + '%' so it would join to the top level and iterate down through all the children.  *EDIT: forgot to say I did the like AND depth = depth+1 from the anchored CTE

    However, that triangle joins out of control.

    I am wondering what other approaches would be?

    I could do a cursor loop but I REALLY wanted to avoid that and just handle it in a view.  I was thinking LEAD LAG or previous with the windowing functions?

    Thoughts or suggestions?

    Thanks!!!

    Create table and sample data attached.

    EDIT: Resorted sample data via ID:

    Just an FYI, but you appear to have data that disagrees with itself in the last couple of rows, where the right-most column should be 2018/07/01, right?

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • sgmunson - Tuesday, September 26, 2017 9:43 AM

    Maxer - Monday, September 25, 2017 1:17 PM

    I have an issue where I wrote a triangle join and then realized that when doing some testing and I want to move away from that solution.
    This is a vendor data structure so I can't change anything about it.

    I do want to make a view to show the "desired output". 

    The issue is this:
    I have a table, the table holds information on containers, (think like large cargo boxes down to pallets, down to individual boxes).

    One of the fields is a DueDate field.  The issue is this:
    In the table data DueDate is only populated for the top level (1=top and  2 lower than 1, 3 lower than 2, 4 = lowest, etc...)  and it is inherited for every level BELOW that....
    unless a different date is populated.  Then it stops inheriting the parent and that level becomes the parent and it continues down.

    Sort of like a rolling DOWN , roll down until you hit new value then use the new and roll it down .
    It rolls down as a ragged hierarchy and each "." in the ContainerID indicates a deeper level.  So while multiple rows are level 3, they have different heritage.  Z7281.2.A1 shares the same progenitor (Z7281) but have different children: Z7281.2.A1 stops at Depth 3 but it's sibling Z7281.2.A3 continues on to have additional children: Z7281.2.A3.78,79,80,etc...  (ragged hierarchy)

     

    Note on the left only a few of those dates are populated.  On the right the color coding matches the "roll down".

    I had tried joining on depth 1  (as anchor) values using LIKE 'ContainerID ' + '%' so it would join to the top level and iterate down through all the children.  *EDIT: forgot to say I did the like AND depth = depth+1 from the anchored CTE

    However, that triangle joins out of control.

    I am wondering what other approaches would be?

    I could do a cursor loop but I REALLY wanted to avoid that and just handle it in a view.  I was thinking LEAD LAG or previous with the windowing functions?

    Thoughts or suggestions?

    Thanks!!!

    Create table and sample data attached.

    EDIT: Resorted sample data via ID:

    Just an FYI, but you appear to have data that disagrees with itself in the last couple of rows, where the right-most column should be 2018/07/01, right?

    Oh wow, yes absolutely correct.

    Updating pics 🙂


Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply