Building a tree view using t-sql

  • This one has got me stumped, I have a solution ready but I am sure it can be done a lot better. The good news is that there wont be more than 5 sub levels , the bad news is seq is not the seq within the nested level but the sequence within the entire tree.

    create table #tree

    (

    level int ,

    seq int ,

    value varchar(200)

    )

    insert into #tree

    select 2,2,'Earth' union all

    select 3,3,'Europe' union all

    select 4,4,'France' union all

    select 4,5,'Germany' union all

    select 5,6,'Munich' union all

    select 5,7,'Frankfurt' union all

    select 6,8,'Ziel' union all

    select 6,9,'Schaumainkai' union all

    select 3,10,'Asia' union all

    select 3,11,'North America' union all

    select 4,12,'Canada' union all

    select 3,13,'South America' union all

    select 2,14,'Movies' union all

    select 3,15,'Action' union all

    select 4,16,'English' union all

    select 3,17,'Comedy' union all

    select 2,18,'Games' union all

    select 3,19,'Team Sports' union all

    select 3,20,'Individual '

    select * from #tree

    The desired output is supposed to look like

    PositionResult

    1 Earth

    1.1 Europe

    1.1.1 France

    1.1.2 Germany

    1.1.2.1 Munich

    1.1.2.2 Frankfurt

    1.1.2.2.1 Ziel

    1.1.2.2.2 Schaumainkai

    1.2 Asia

    1.3 North America

    1.3.1 Canada

    1.4 South America

    2 Movies

    2.1 Action

    2.1.1 English

    2.2 Comedy

    3 Games

    3.1 Team Sports

    3.2 Individual

    Any suggestions would be helpful.

    Jayanth Kurup[/url]

  • /*

    DROP TABLE #TREE

    create table #tree (Position VARCHAR(10), level int, seq int, value varchar(200))

    insert into #tree

    select '1', 2,2,'Earth' union all

    select '1.1', 3,3,'Europe' union all

    select '1.1.1', 4,4,'France' union all

    select '1.1.2', 4,5,'Germany' union all

    select '1.1.2.1', 5,6,'Munich' union all

    select '1.1.2.2', 5,7,'Frankfurt' union all

    select '1.1.2.2.1', 6,8,'Ziel' union all

    select '1.1.2.2.2', 6,9,'Schaumainkai' union all

    select '1.2', 3,10,'Asia' union all

    select '1.3', 3,11,'North America' union all

    select '1.3.1', 4,12,'Canada' union all

    select '1.4', 3,13,'South America' union all

    select '2', 2,14,'Movies' union all

    select '2.1', 3,15,'Action' union all

    select '2.1.1', 4,16,'English' union all

    select '2.2', 3,17,'Comedy' union all

    select '3', 2,18,'Games' union all

    select '3.1', 3,19,'Team Sports' union all

    select '3.2', 3,20,'Individual '

    */

    ;WITH Resolver AS (

    SELECT Position, level, seq, value,

    Pos1 = CAST(1 AS TINYINT),

    Pos2 = CAST(NULL AS TINYINT),

    Pos3 = CAST(NULL AS TINYINT),

    Pos4 = CAST(NULL AS TINYINT),

    Pos5 = CAST(NULL AS TINYINT)

    FROM #tree

    WHERE seq = 2

    UNION ALL

    SELECT tr.Position, tr.level, tr.seq, tr.value,

    Pos1 = CAST(CASE WHEN tr.level = 2 THEN lr.Pos1+1 ELSE lr.Pos1 END AS TINYINT),

    Pos2 = CAST(CASE

    WHEN tr.level < 3 THEN NULL

    WHEN tr.level = 3 AND lr.level = 2 THEN 1

    WHEN tr.level = 3 AND lr.level > 2 THEN lr.Pos2+1

    ELSE lr.Pos2 END AS TINYINT),

    Pos3 = CAST(CASE

    WHEN tr.level < 4 THEN NULL

    WHEN tr.level = 4 AND lr.level = 3 THEN 1

    WHEN tr.level = 4 AND lr.level > 3 THEN lr.Pos3+1

    ELSE lr.Pos3 END AS TINYINT),

    Pos4 = CAST(CASE

    WHEN tr.level < 5 THEN NULL

    WHEN tr.level = 5 AND lr.level = 4 THEN 1

    WHEN tr.level = 5 AND lr.level > 4 THEN lr.Pos4+1

    ELSE lr.Pos4 END AS TINYINT),

    Pos5 = CAST(CASE

    WHEN tr.level < 6 THEN NULL

    WHEN tr.level = 6 AND lr.level = 5 THEN 1

    WHEN tr.level = 6 AND lr.level > 5 THEN lr.Pos5+1

    ELSE lr.Pos5 END AS TINYINT)

    FROM Resolver lr

    INNER JOIN #tree tr

    ON tr.seq = lr.seq+1

    )

    SELECT *

    FROM Resolver

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Sweet !!!

    It never occurred to me to use a case statement within the Recr CTE

    Thanks Chris.

    Jayanth Kurup[/url]

  • I used a much different tactic. I'll explain why in a minute but let's first see how it solves your immediate problem.

    First, here's your original table and data (keeping it all together for those that want to play along).

    --===== If the test table already exists, drop it to make reruns easier.

    IF OBJECT_ID('tempdb..#Tree') IS NOT NULL DROP TABLE #Tree

    ;

    GO

    --===== Create the test table

    CREATE TABLE #Tree

    (

    [Level] INT NOT NULL,

    Seq INT NOT NULL, --child

    Value VARCHAR(200) NOT NULL

    )

    ;

    --===== Populate the test table with the initial data

    INSERT INTO #Tree

    ([Level],Seq,Value)

    SELECT 2, 2,'Earth' UNION ALL

    SELECT 3, 3,'Europe' UNION ALL

    SELECT 4, 4,'France' UNION ALL

    SELECT 4, 5,'Germany' UNION ALL

    SELECT 5, 6,'Munich' UNION ALL

    SELECT 5, 7,'Frankfurt' UNION ALL

    SELECT 6, 8,'Ziel' UNION ALL

    SELECT 6, 9,'Schaumainkai' UNION ALL

    SELECT 3,10,'Asia' UNION ALL

    SELECT 3,11,'North America' UNION ALL

    SELECT 4,12,'Canada' UNION ALL

    SELECT 3,13,'South America' UNION ALL

    SELECT 2,14,'Movies' UNION ALL

    SELECT 3,15,'Action' UNION ALL

    SELECT 4,16,'English' UNION ALL

    SELECT 3,17,'Comedy' UNION ALL

    SELECT 2,18,'Games' UNION ALL

    SELECT 3,19,'Team Sports' UNION ALL

    SELECT 3,20,'Individual'

    ;

    What we're going to do is convert that to an Adjacency List to start with and before it all get's out of hand for you (just add the United States to the North America node and you'll see the mess that you have to clean up).

    The first thing that we need and that you should have a copy of in virtually every user database that you have is an fnTally function. Here's mine.

    CREATE FUNCTION [dbo].[fnTally]

    /**********************************************************************************************************************

    Purpose:

    Return a column of BIGINTs from @ZeroOrOne up to and including @MaxN with a max value of 1 Trillion.

    As a performance note, it takes about 00:02:10 (hh:mm:ss) to generate 1 Billion numbers to a throw-away variable.

    Usage:

    --===== Syntax example (Returns BIGINT)

    SELECT t.N

    FROM dbo.fnTally(@ZeroOrOne,@MaxN) t

    ;

    Notes:

    1. Based on Itzik Ben-Gan's cascading CTE (cCTE) method for creating a "readless" Tally Table source of BIGINTs.

    Refer to the following URLs for how it works and introduction for how it replaces certain loops.

    http://www.sqlservercentral.com/articles/T-SQL/62867/

    http://sqlmag.com/sql-server/virtual-auxiliary-table-numbers

    2. To start a sequence at 0, @ZeroOrOne must be 0 or NULL. Any other value that's convertable to the BIT data-type

    will cause the sequence to start at 1.

    3. If @ZeroOrOne = 1 and @MaxN = 0, no rows will be returned.

    5. If @MaxN is negative or NULL, a "TOP" error will be returned.

    6. @MaxN must be a positive number from >= the value of @ZeroOrOne up to and including 1 Billion. If a larger

    number is used, the function will silently truncate after 1 Billion. If you actually need a sequence with

    that many values, you should consider using a different tool. ;-)

    7. There will be a substantial reduction in performance if "N" is sorted in descending order. If a descending

    sort is required, use code similar to the following. Performance will decrease by about 27% but it's still

    very fast especially compared with just doing a simple descending sort on "N", which is about 20 times slower.

    If @ZeroOrOne is a 0, in this case, remove the "+1" from the code.

    DECLARE @MaxN BIGINT;

    SELECT @MaxN = 1000;

    SELECT DescendingN = @MaxN-N+1

    FROM dbo.fnTally(1,@MaxN);

    8. There is no performance penalty for sorting "N" in ascending order because the output is explicity sorted by

    ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

    Revision History:

    Rev 00 - Unknown - Jeff Moden

    - Initial creation with error handling for @MaxN.

    Rev 01 - 09 Feb 2013 - Jeff Moden

    - Modified to start at 0 or 1.

    Rev 02 - 16 May 2013 - Jeff Moden

    - Removed error handling for @MaxN because of exceptional cases.

    Rev 03 - 22 Apr 2015 - Jeff Moden

    - Modify to handle 1 Trillion rows for experimental purposes.

    **********************************************************************************************************************/

    (@ZeroOrOne BIT, @MaxN BIGINT)

    RETURNS TABLE WITH SCHEMABINDING AS

    RETURN WITH

    E1(N) AS (SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1) --10E1 or 10 rows

    , E4(N) AS (SELECT 1 FROM E1 a, E1 b, E1 c, E1 d) --10E4 or 10 Thousand rows

    ,E12(N) AS (SELECT 1 FROM E4 a, E4 b, E4 c) --10E12 or 1 Trillion rows

    SELECT N = 0 WHERE ISNULL(@ZeroOrOne,0)= 0 --Conditionally start at 0.

    UNION ALL

    SELECT TOP(@MaxN) N = ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E12 -- Values from 1 to @MaxN

    ;

    Up next, we need to add a couple of things to your table to convert it to an Adjacency List and then do the actual conversion to an Adjacency list. As usual, details are in the comments in the code.

    --===== Modify the table to set it up as an Adjacency List.

    ALTER TABLE #Tree

    ADD ParentSeq INT NULL

    ;

    --===== Add the top level "forest" row to accomodate all trees in the forest.

    -- Since your list started with Seq = 2, I can only imagine that you

    -- that you actually have a top level "forest" node.

    INSERT INTO #Tree

    ([Level],Seq,Value)

    SELECT 1, 1,'Universe'

    ;

    --===== Add the unique clustered index we'll need for performance

    -- and to ensure it has the minimum requirement of an Adjacency List,

    -- which is "unique children".

    ALTER TABLE #Tree

    ADD PRIMARY KEY CLUSTERED (SEQ)

    ;

    --===== Determine the Parent Seq for each row according to the order of SEQ,

    -- which is what is needed to convert this to an Adjacency List

    WITH cteLevels AS (SELECT N FROM dbo.fnTally(1,(SELECT MAX([Level]) FROM #Tree)))

    UPDATE tgt

    SET ParentSeq = ca.Seq

    FROM #tree tgt

    JOIN cteLevels l ON tgt.[Level] = l.N+1

    CROSS APPLY (SELECT TOP 1 seq

    FROM #tree src

    WHERE src.seq < tgt.seq

    AND src.[Level] = l.N

    ORDER BY src.seq DESC) ca (Seq)

    ;

    Last but not least, let's solve your immediate problem (which will also be a future problem when it comes to maintenance because of the current form of your data). To be honest, a properly written While Loop would beat this rCTE but it's still pretty fast because each iteration resolves an entire level rather than just one row at a time.

    --===== Produce the desired "legal-format" list for Position.

    WITH cteLegalList AS

    (

    SELECT Position = CAST(ROW_NUMBER() OVER (ORDER BY Seq) AS VARCHAR(1000))

    ,*

    FROM #Tree

    WHERE [Level] = 2

    UNION ALL

    SELECT Position = CAST(cte.Position + '.' + CAST(ROW_NUMBER() OVER (ORDER BY tbl.Seq) AS VARCHAR(10)) AS VARCHAR(1000))

    ,[Level] = cte.[Level] + 1

    ,tbl.Seq

    ,tbl.Value

    ,tbl.ParentSeq

    FROM cteLegalList cte

    JOIN #Tree tbl ON cte.Seq = tbl.ParentSeq

    )

    SELECT * --Include only the columns you want. I used "*" just so you could see everything in the table

    FROM cteLegalList

    ORDER BY Seq

    OPTION (MAXRECURSION 120) --Safety valve

    ;

    Here are the results...

    Position Level Seq Value

    ---------- ----------- ----------- --------------

    1 2 2 Earth

    1.1 3 3 Europe

    1.1.1 4 4 France

    1.1.2 4 5 Germany

    1.1.2.1 5 6 Munich

    1.1.2.2 5 7 Frankfurt

    1.1.2.2.1 6 8 Ziel

    1.1.2.2.2 6 9 Schaumainkai

    1.2 3 10 Asia

    1.3 3 11 North America

    1.3.1 4 12 Canada

    1.4 3 13 South America

    2 2 14 Movies

    2.1 3 15 Action

    2.1.1 4 16 English

    2.2 3 17 Comedy

    3 2 18 Games

    3.1 3 19 Team Sports

    3.2 3 20 Individual

    (19 row(s) affected)

    Ok... so why convert it to an Adjacency List (and I DO recommend that you convert it to a permanent table)?

    First and like I said, add just one node to North America in your original data and realize that you must update the Seq values for all values below that. A true Adjacency List would have no such problem if you do a little trick to calculate a separate sequence instead of using Seq as the (child) ID for each row.

    Second, there's so much more you can do with an Adjacency List. Here are a couple of articles to show you the possibilities and to make maintenance so much easier than what you currently have.

    http://www.sqlservercentral.com/articles/Hierarchy/94040/

    http://www.sqlservercentral.com/articles/T-SQL/94570/

    Third, the code I used is good to over 111 levels. Unless you're running an MLM, there's a pretty good chance that you won't go past that. If you follow the techniques in the articles above, you can easily get 2000 levels and code that runs much faster for a whole lot more than creating an eventually unsortable list like you have with the current data.

    Any suggestions would be helpful

    As a bit of a side bar, I'd lose the positional notation because, by itself, it's not going to be sortable once a level goes over 9 nodes not to mention the fact that it's going to be renumbered as soon as you add or delete even just one node. If you keep that notation, you will be required to manually update the Seq column of your original table. If you learn to use an Adjacency List and the Nested Sets in the first article I directed you to, life will be much easier for you because it will maintain a "sort" column for you auto-magically.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply