T-SQL puzzler - given a cell in an N-dimensional space, return the cells that are inline with the given cell along each axis

  • The cells in an N-dimensional space are modelled with the 2 tables below. A script is needed that takes a single cell (by CellID) and returns all the other cells that are "inline" with the given cell along each axis (including itself).

    For example, suppose the space has 3 dimensions (X, Y, Z) and X has 2 positions, Y has 2 and Z has 3. If the cell with coordinates {1,1,1} is given, the result should be:

    [font="Courier New"] +----------+---------+

    | AxisCode | Cell |

    +----------+---------+

    | X | {1,1,1} | <- showing coordinates for clarity, but should be CellID

    | X | {2,1,1} |

    | Y | {1,1,1} |

    | Y | {1,2,1} |

    | Z | {1,1,1} |

    | Z | {1,1,2} |

    | Z | {1,1,3} |

    +----------+---------+[/font]

    I have spent hours on this and only come up with queries that are hard-coded for a specific number of dimensions ....

    Please note:

    **Changing the schema of the 2 tables is not an option!

    The script has to work for N dimensions, and should not involve loops or cursors.**

    Compatibility must be MS SQL 2008 R2

    Any ideas gratefully received!

    create table dbo.Cells(

    CellID int not null,

    CellValue int not null,

    constraint PK_Cells primary key (CellID)

    )

    create table dbo.AxisPositions(

    AxisCode char(1) not null,-- X, Y, Z etc

    PositionOnAxis int not null,-- 1, 2, 3, 4 etc

    constraint PK_AxisPositions primary key (AxisCode, PositionOnAxis)

    )

    create table dbo.CellAxes(

    CellID int not null,

    AxisCode char(1) not null,-- X, Y, Z etc

    PositionOnAxis int not null,-- 1, 2, 3, 4 etc

    constraint PK_CellAxes primary key (CellID, AxisCode),

    constraint FK_CellAxes_Cells foreign key (CellID) references Cells(CellID),

    constraint FK_CellAxes_AxisPositions foreign key (AxisCode, PositionOnAxis) references AxisPositions(AxisCode, PositionOnAxis)

    )

    -- Example data

    insert Cells (CellID, CellValue)

    values (1, 67), (2, 45), (3, 0), (4, 4), (5, 78), (6, 213), (7, 546), (8, 455), (9, 12), (10, 67), (11, 4), (12, 5)

    insert AxisPositions (AxisCode, PositionOnAxis)

    values ('X', 1), ('X', 2), ('Y', 1), ('Y', 2), ('Z', 1), ('Z', 2), ('Z', 3)

    insert CellAxes (CellID, AxisCode, PositionOnAxis)

    values (1, 'X', 1), (1, 'Y', 1), (1, 'Z', 1),

    (2, 'X', 2), (2, 'Y', 1), (2, 'Z', 1),

    (3, 'X', 1), (3, 'Y', 2), (3, 'Z', 1),

    (4, 'X', 2), (4, 'Y', 2), (4, 'Z', 1),

    (5, 'X', 1), (5, 'Y', 1), (5, 'Z', 2),

    (6, 'X', 2), (6, 'Y', 1), (6, 'Z', 2),

    (7, 'X', 1), (7, 'Y', 2), (7, 'Z', 2),

    (8, 'X', 2), (8, 'Y', 2), (8, 'Z', 2),

    (9, 'X', 1), (9, 'Y', 1), (9, 'Z', 3),

    (10, 'X', 2), (10, 'Y', 1), (10, 'Z', 3),

    (11, 'X', 1), (11, 'Y', 2), (11, 'Z', 3),

    (12, 'X', 2), (12, 'Y', 2), (12, 'Z', 3)

  • I'm not sure if I fully understood the requirement (especially for N>3)...

    My assumption: a cell is "inline" if all but one value match the cell coordinates in question.

    "All but one" is used to meet the criteria "N-dimensional" .

    Your expected output shows 7 rows, but the cell (1,1,1) is displayed three times, so in fact there are only 5 matching cells.

    Here's what I came up with:

    WITH sample_cell AS

    (

    SELECT 'X' AS AxisCode, 1 AS Pos UNION ALL

    SELECT 'Y' AS AxisCode, 1 AS Pos UNION ALL

    SELECT 'Z' AS AxisCode, 1 AS Pos

    )

    SELECT

    cellid

    FROM CellAxes ca

    INNER JOIN sample_cell on ca.AxisCode = sample_cell.AxisCode AND sample_cell.Pos=ca.PositionOnAxis

    GROUP BY cellid

    HAVING COUNT(*) >= (SELECT COUNT(*) -1 AS sub FROM sample_cell)

    It's basically the SQL logic format of the above "business logic". The cte sample_cell is used to describe the given cell. Please not that there's one row per axis.

    This basic solution does not include any test whether all axis are provided or if the given values are not valid as per the rows in AxisPositions. But this check could be added, if required...

    Edit: if you want to remove the original cell, just change the comparison from >= to =



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • Wow! That is pretty close. However, what I needed was this result set:

    +----------+---------+

    | AxisCode | CellID |

    +----------+---------+

    | X | 1 |

    | X | 2 |

    | Y | 1 |

    | Y | 3 |

    | Z | 1 |

    | Z | 5 |

    | Z | 9 |

    +----------+---------+

    i.e. "if you walk the X dimension you get cells 1 & 2; walk Y you get 1 & 3; walk Z you get 1, 5 & 9. I do actually need to get cell 1 in each dimension.

  • I now have the result I need. I combined your approach with someone else's help on StackOverflow and came up with this:

    declare @CellID int = 1

    ;with SelectedCell as

    (

    select CellID, AxisCode, PositionOnAxis

    from CellAxes

    where CellID = @CellID

    )

    select x.AxisCode, a2.CellID

    from SelectedCell a1

    join CellAxes a2 on a2.AxisCode = a1.AxisCode

    join CellAxes x on x.CellID = a1.CellID

    where (a1.AxisCode = x.AxisCode or a1.PositionOnAxis = a2.PositionOnAxis)

    group by x.AxisCode, a2.CellID

    having count(*) = (select count(AxisCode) from SelectedCell)

    order by x.AxisCode, a2.CellID

    Thanks for your help

  • ok, it's getting slightly more complicated:

    WITH sample_cell AS

    (

    SELECT 'X' AS AxisCode, 1 AS Pos UNION ALL

    SELECT 'Y' AS AxisCode, 1 AS Pos UNION ALL

    SELECT 'Z' AS AxisCode, 1 AS Pos

    ), cte_almost as

    (

    SELECT

    cellid

    FROM CellAxes ca

    INNER JOIN sample_cell on ca.AxisCode = sample_cell.AxisCode AND sample_cell.Pos=ca.PositionOnAxis

    GROUP BY cellid

    HAVING COUNT(*) = (SELECT COUNT(*) -1 AS sub FROM sample_cell)

    ), cte_unsortet as

    (

    SELECT

    MAX(CASE WHEN sc.pos <> ca.PositionOnAxis THEN ca.AxisCode ELSE NULL END) as Axis,

    cte_almost.cellid

    FROM cte_almost

    INNER JOIN CellAxes ca ON cte_almost.cellid = ca.Cellid

    LEFT OUTER JOIN sample_cell sc ON ca.axisCode=sc.AxisCode

    GROUP BY cte_almost.cellid

    UNION ALL

    SELECT AxisCode, Pos

    FROM sample_cell

    )

    SELECT * FROM cte_unsortet ORDER BY Axis, cellid

    The cellid extracted earlier now are used to join them ack to the original table as well as the source cell. Please note that I changed it from COUNT(*) >= (SELECT ... to COUNT(*) = (SELECT to eliminate the cource cell id.

    Then a CROSS TAB (or pivot) is performed where the axis name is derived from the axis position that does not match the source cell together with the cellid. (Almost) Finally, the source cell is simply added to the result set using a UNION ALL query.

    At the very end, the result set is sorted as per the requirement.



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • I'm not sure if I am understanding the requirements for this correctly, but would this work?

    declare @cellid int = 1;

    select *

    from CellAxes c

    inner join (

    select cellid

    from (select axiscode, positiononaxis from CellAxes where cellid = @cellid) a

    inner join CellAxes b on a.axisCode = b.axisCode

    group by cellid having sum(case when a.positiononaxis = b.positiononaxis then 0 else 1 end) <= 1

    ) il on c.cellid = il.cellid

  • Laurence Neville (11/14/2013)


    I now have the result I need. I combined your approach with someone else's help on StackOverflow and came up with this:

    declare @CellID int = 1

    ;with SelectedCell as

    (

    select CellID, AxisCode, PositionOnAxis

    from CellAxes

    where CellID = @CellID

    )

    select x.AxisCode, a2.CellID

    from SelectedCell a1

    join CellAxes a2 on a2.AxisCode = a1.AxisCode

    join CellAxes x on x.CellID = a1.CellID

    where (a1.AxisCode = x.AxisCode or a1.PositionOnAxis = a2.PositionOnAxis)

    group by x.AxisCode, a2.CellID

    having count(*) = (select count(AxisCode) from SelectedCell)

    order by x.AxisCode, a2.CellID

    Thanks for your help

    This may be an alternate solution:

    WITH RowsToAxes AS

    (

    SELECT CellID

    ,X=MAX(CASE WHEN AxisCode = 'X' THEN PositionOnAxis END)

    ,Y=MAX(CASE WHEN AxisCode = 'Y' THEN PositionOnAxis END)

    ,Z=MAX(CASE WHEN AxisCode = 'Z' THEN PositionOnAxis END)

    FROM CellAxes

    GROUP BY CellID

    )

    SELECT b.AxisCode, b.CellID

    FROM

    (

    SELECT CellID, a.X, a.Y, a.Z, X1, Y1, Z1

    ,C=X1 + Y1 + Z1

    FROM RowsToAxes a

    CROSS APPLY

    (

    SELECT X1=CASE X WHEN a.X THEN 1 ELSE 0 END

    ,Y1=CASE Y WHEN a.Y THEN 1 ELSE 0 END

    ,Z1=CASE Z WHEN a.Z THEN 1 ELSE 0 END

    FROM RowsToAxes

    WHERE CellID = @CellID

    ) b

    ) a

    CROSS APPLY

    (

    VALUES (CASE C WHEN 3 THEN CellID ELSE (1-X1)*CellID END, 'X')

    ,(CASE C WHEN 3 THEN CellID ELSE (1-Y1)*CellID END, 'Y')

    ,(CASE C WHEN 3 THEN CellID ELSE (1-Z1)*CellID END, 'Z')

    ) b (CellID, AxisCode)

    WHERE C IN (2,3) AND b.CellID <> 0;

    Not sure if it is 100% correct but it does return the desired results for this case. Compared to your original solution, it looks more complicated, but the execution plan gives me:

    Original: Cost=41%, Clustered INDEX Scans=1, Clustered INDEX Seeks=3

    MickyT: Cost=18%, Clustered INDEX Scans=1, Clustered INDEX Seeks=2

    Mine: Cost=8%, Clustered INDEX Scans=1, Clustered INDEX Seeks=1

    Note that Cost cannot always be relied upon, just included for reference.

    IO Counts -

    Original: Scan count 38, logical reads 148, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    MickyT: Scan count 6, logical reads 84, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Mine: Scan count 2, logical reads 4, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Edit1: Added MickyT to the comparison.

    Edit2: Slightly simplified my suggested solution (no impact to Execution Plan or IO stats).


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • mickyT (11/14/2013)


    I'm not sure if I am understanding the requirements for this correctly, but would this work?

    declare @cellid int = 1;

    select *

    from CellAxes c

    inner join (

    select cellid

    from (select axiscode, positiononaxis from CellAxes where cellid = @cellid) a

    inner join CellAxes b on a.axisCode = b.axisCode

    group by cellid having sum(case when a.positiononaxis = b.positiononaxis then 0 else 1 end) <= 1

    ) il on c.cellid = il.cellid

    Micky - Tried to include this in my comparison but got this:

    Each GROUP BY expression must contain at least one column that is not an outer reference.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Please ignore the error I reported for MickyT's solution. Copy/paste error.

    I'm going back to my previous post to include MickyT's results.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Laurence Neville (11/14/2013)


    Wow! That is pretty close. However, what I needed was this result set:

    +----------+---------+

    | AxisCode | CellID |

    +----------+---------+

    | X | 1 |

    | X | 2 |

    | Y | 1 |

    | Y | 3 |

    | Z | 1 |

    | Z | 5 |

    | Z | 9 |

    +----------+---------+

    i.e. "if you walk the X dimension you get cells 1 & 2; walk Y you get 1 & 3; walk Z you get 1, 5 & 9. I do actually need to get cell 1 in each dimension.

    I've been trying to get my result to look like his while allowing for extra dimensions, time to get my vpn going so I can work on it over the weekend. My solution should work over additional dimensions (as far as cellids go), but I'm having problems getting it to output in the format above.

  • mickeyT - your solution returns too many rows unfortunately.

    dwain.c - your solution returns the right rows for 3 dimensions (X,Y,Z) and the query plan is by far the best, but it is hard-coded to assume those specific dimensions and I need a generic N-dimensional solution.

    LutzM - your 2nd query returns the right rows for cell ID 1 {1,1,1}. However for other cell IDs it returns the wrong rows.

    Thanks for everyone's contributions. For now the best solution is:

    declare @CellID int = 3

    ;with SelectedCell as

    (

    select CellID, AxisCode, PositionOnAxis

    from CellAxes

    where CellID = @CellID

    )

    select x.AxisCode, a2.CellID

    from SelectedCell a1

    join CellAxes a2 on a2.AxisCode = a1.AxisCode

    join CellAxes x on x.CellID = a1.CellID

    where (a1.AxisCode = x.AxisCode or a1.PositionOnAxis = a2.PositionOnAxis)

    group by x.AxisCode, a2.CellID

    having count(*) = (select count(AxisCode) from SelectedCell)

    order by x.AxisCode, a2.CellID

  • Laurence Neville (11/15/2013)


    dwain.c - your solution returns the right rows for 3 dimensions (X,Y,Z) and the query plan is by far the best, but it is hard-coded to assume those specific dimensions and I need a generic N-dimensional solution.

    I suppose next you'll tell me I can't use dynamic SQL.

    DECLARE @SQL NVARCHAR(MAX)

    ,@SQLParms NVARCHAR(MAX) = N'@Cellid INT'

    ,@CellID INT = 1;

    WITH Axes AS

    (

    SELECT DISTINCT AxisCode

    FROM CellAxes

    )

    ,CountAxes AS

    (

    SELECT AxesCount=COUNT(*)

    FROM Axes

    )

    SELECT @SQL = N'WITH RowsToAxes AS

    (

    SELECT CellID ' + CHAR(10) +

    (

    SELECT (

    SELECT ',' + AxisCode + '=MAX(CASE WHEN AxisCode = ''' + AxisCode + ''' THEN PositionOnAxis END)' + CHAR(10)

    FROM Axes

    FOR XML PATH(''))

    ) + CHAR(10) +

    N' FROM CellAxes

    GROUP BY CellID

    )

    SELECT b.AxisCode, b.CellID

    FROM

    (

    SELECT CellID' + CHAR(10) +

    (

    SELECT

    (

    SELECT ',a.' + AxisCode

    FROM Axes

    FOR XML PATH('')

    )

    ) + CHAR(10) +

    (

    SELECT

    (

    SELECT ',' + AxisCode + '1'

    FROM Axes

    FOR XML PATH('')

    )

    ) + CHAR(10) +',C=' +

    (

    SELECT STUFF(

    (

    SELECT '+' + AxisCode + '1'

    FROM Axes

    FOR XML PATH('')

    ),1,1,'')

    ) + CHAR(10) +

    ' FROM RowsToAxes a

    CROSS APPLY

    (

    SELECT ' + CHAR(10) +

    (

    SELECT STUFF(

    (

    SELECT ',' + AxisCode + '1=CASE ' + AxisCode + ' WHEN a.' + AxisCode + ' THEN 1 ELSE 0 END' + CHAR(10)

    FROM Axes

    FOR XML PATH('')

    ),1,1,'')

    ) + CHAR(10) +

    ' FROM RowsToAxes

    WHERE CellID = @CellID

    ) b

    ) a

    CROSS APPLY

    (

    VALUES' + CHAR(10) +

    (

    SELECT STUFF(

    (

    SELECT ',(CASE C WHEN ' + CAST((SELECT AxesCount FROM CountAxes) AS VARCHAR(5)) +

    ' THEN CellID ELSE (1-' + AxisCode + '1)*Cellid END, ''' + AxisCode + ''')' + CHAR(10)

    FROM Axes

    FOR XML PATH('')

    ),1,1,'')

    ) + CHAR(10) +

    ') b (CellID, AxisCode)

    WHERE C IN (' + CAST((SELECT AxesCount-1 FROM CountAxes) AS VARCHAR(5)) +

    ',' + CAST((SELECT AxesCount FROM CountAxes) AS VARCHAR(5)) +

    ') AND b.CellID <> 0;';

    PRINT @SQL

    EXEC sp_executesql @SQL, @SQLParms, @CellID = @CellID


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • OK , so I think I have got there, but it has made my query real ugly

    declare @cellid int = 1;

    select axiscode, cellid

    from (

    select c.cellid,

    case when c.cellid = @cellid then c.axiscode else

    (select c.axiscode

    from #CellAxes x

    where c.axiscode = x.axiscode and x.cellid = @cellid and c.positiononaxis <> x.positiononaxis)

    end axiscode

    from #CellAxes c

    inner join (

    select cellid

    from (select axiscode, positiononaxis from #CellAxes where cellid = @cellid) a

    inner join #CellAxes b on a.axisCode = b.axisCode

    group by cellid having sum(case when a.positiononaxis = b.positiononaxis then 0 else 1 end) <= 1

    ) il on c.cellid = il.cellid

    ) z

    where axiscode is not null

    order by axiscode

  • dwain.c - nope, I have nothing against dynamic SQL!

    I have to award the prize to mickeyT's solution - it is the fastest of all the contributions.

    Thanks everyone ... I was really banging my head with this one, and when I posted on StackOverflow most responses were "it can't be done in SQL". I guess I was asking in the wrong place!

  • Laurence Neville (11/18/2013)


    dwain.c - nope, I have nothing against dynamic SQL!

    I have to award the prize to mickeyT's solution - it is the fastest of all the contributions.

    Thanks everyone ... I was really banging my head with this one, and when I posted on StackOverflow most responses were "it can't be done in SQL". I guess I was asking in the wrong place!

    Even though you chose MickyT's solution over mine (congratulations Micky!), I am still gratified whenever I hear someone say "it can't be done in SQL" and yet here are multiple solutions showing how it can.

    At least you now know where to come next time.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

Viewing 15 posts - 1 through 15 (of 23 total)

You must be logged in to reply to this topic. Login to reply