Loading Hierarchical data

  • Hi Experts,

    CREATE TABLE #Source

    (

    Id int identity(1,1)

    ,categoryint

    ,Leaf_Node_code varchar(10) --

    ,Level1_Name varchar(20)

    ,Level2_Name varchar(20)

    ,Level3_Name varchar(20)

    ,Level4_Name varchar(20)

    ,Level5_Name varchar(20)

    )

    INSERT INTO #Source

    SELECT 1,'101','World','Asia','India',null , null UNION ALL

    SELECT 2,'102','a','aa','aaa','aaaa' , null UNION ALL

    SELECT 3,'103','b','bb','bbb','bbbb','bbbbb'

    Here category 1 has 3 levels ,

    category 2 has 4 levels ,

    category 3 has 5 levels ,

    below is the target table, here Leaf_Node_code should populate to only for leaf nodes for each category ..

    Need to populate Node_id with hierarchical data

    I am unable frame a sql query to handle different levels , in future #Source may have more levels .

    Please guide me on handle multiple hierarchy levels .. here only leaf node should have Leaf_Node_code

    CREATE TABLE TARGET_TABLE

    (

    ID INT IDENTITY(1,1) primary key

    ,Node_id HIERARCHYID

    ,category int

    ,Parent_id int references TARGET_TABLE(id)

    ,Leaf_Node_code varchar(10)

    ,Namevarchar(20)

    )

    Here is the expected output:

    IDcategoryParent_idLeaf_Node_codeName Node_id

    11NULLNULLWorld

    211NULLAsia

    312101India

    42NULLNULLa

    524NULLaa

    625NULLaaa

    726102aaaa

    83NULLNULLb

    938NULLbb

    1039NULLbbb

    11310NULLbbbb

    12311103bbbb

    Thanks.

  • First, let me say "thank you" for the excellent explanation, the readily consumable sample data, and the expected results. Well done!

    Here's one solution to the problem (was a lot of fun, BTW... thanks for the interesting problem). As always, details are in the code. I did change on column name in the output

    WITH

    cteUnpivot AS

    ( --=== This cte does the unpivoting and everything except assignment of the Parent_ID.

    SELECT ID = ROW_NUMBER() OVER (ORDER BY s.Category, ca.Lvl),

    s.Category,

    Leaf_Node_Code = --This numbers the rows "backwards" for each category and then assigns the

    --Leaf_Node_Code to the "lowest" level available which will have a "1" because

    --of the backwards numbering.

    CASE ROW_NUMBER() OVER (PARTITION BY s.Category ORDER BY s.Category, ca.Lvl DESC)

    WHEN 1 THEN Leaf_Node_code

    ELSE NULL

    END,

    ca.Lvl, --We need this in the SELECT after the CTE to assign NULL to the Parent_ID

    --of the top level of each category.

    ca.[Name Node_ID]

    FROM #Source s

    CROSS APPLY --This CROSS APPLY does the "unpivot" of levels for each row.

    ( --== This outer query eliminates level row with nulls in them.

    SELECT Lvl, [Name Node_ID]

    FROM ( --== This unpivots the level names and assigns a processing order (Lvl).

    SELECT 1, Level1_Name UNION ALL

    SELECT 2, Level2_Name UNION ALL

    SELECT 3, Level3_Name UNION ALL

    SELECT 4, Level4_Name UNION ALL

    SELECT 5, Level5_Name

    ) unpvt (Lvl, [Name Node_ID])

    WHERE [Name Node_ID] > '' --Not Blank and Not Null

    ) ca (Lvl, [Name Node_ID])

    )

    SELECT ID,

    Category,

    Parent_ID = --Assigns NULL to the Parent_ID of the "top" each category according to level

    --and simply subtracts 1 from the ID to come up with the Parent_ID for the rest.

    CASE Lvl

    WHEN 1 THEN NULL

    ELSE ID - 1

    END,

    Leaf_Node_Code,

    [Name Node_ID]

    FROM cteUnpivot

    ORDER BY ID

    ;

    Here are the results from the code above...

    ID Category Parent_ID Leaf_Node_Code Name Node_ID

    -- --------- --------- -------------- ------------

    1 1 NULL NULL World

    2 1 1 NULL Asia

    3 1 2 101 India

    4 2 NULL NULL a

    5 2 4 NULL aa

    6 2 5 NULL aaa

    7 2 6 102 aaaa

    8 3 NULL NULL b

    9 3 8 NULL bb

    10 3 9 NULL bbb

    11 3 10 NULL bbbb

    12 3 11 103 bbbbb

    (12 row(s) affected)

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks for you query, but it is not giving the correct Parent_ID if I add more rows under category 2(or any) to the temp table ..

    SELECT 2,'104','c','cc','ccc','cccc' , null

    I am also thank full to you if the query returns Node_id column with hierarchical id values to handle levels.

    as per my requirement I have added the Leaf_Node_code PARTITION BY clause.

    ROW_NUMBER() OVER (PARTITION BY s.Category,Leaf_Node_code ORDER BY s.Category,ca.Lvl DESC)

  • Nagaram (8/4/2013)


    Thanks for you query, but it is not giving the correct Parent_ID if I add more rows under category 2(or any) to the temp table ..

    SELECT 2,'104','c','cc','ccc','cccc' , null

    I am also thank full to you if the query returns Node_id column with hierarchical id values to handle levels.

    as per my requirement I have added the Leaf_Node_code PARTITION BY clause.

    ROW_NUMBER() OVER (PARTITION BY s.Category,Leaf_Node_code ORDER BY s.Category,ca.Lvl DESC)

    Please... no partial code. I don't want to have to guess where you added code. Also and if you would please, provide what the entire expected results would be when you add rows to the data so there's no guess-work on my part. I work a lot of posts and, as a result, remembering what is what from post to post and assimilating partial code/data is an impossibility for me. Thanks.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks for the help .. Here I have added few more records to the temp table

    CREATE TABLE #Source

    (

    Id int identity(1,1)

    ,categoryint

    ,Leaf_Node_code varchar(10) --

    ,Level1_Name varchar(20)

    ,Level2_Name varchar(20)

    ,Level3_Name varchar(20)

    ,Level4_Name varchar(20)

    ,Level5_Name varchar(20)

    )

    INSERT INTO #Source

    SELECT 1,'101','World','Asia','India',null , null UNION ALL

    SELECT 2,'102','a','aa','aaa','aaaa' , null UNION ALL

    SELECT 2,'103','a','aa','aaa','bbbb' , null UNION ALL

    SELECT 2,'104','a','aa','aaa','cccc' , null UNION ALL

    SELECT 2,'105','d','dd','ddd','dddd' , null UNION ALL

    SELECT 1,'107','World','Asia','china',null , null

    expected output:

    IDCategoryLeaf_Node_CodeLvlName Node_IDParent_IDNode_id.ToString()

    11NULL1WorldNULL/1/

    21NULL2Asia1/1/1/

    311013India2/1/1/1

    42NULL1aNULL/2/

    52NULL1dNULL/3/

    72NULL2aa4/2/1/

    62NULL2dd5/3/1/

    82NULL3aaa7/2/1/1/

    92NULL3ddd6/3/1/1/

    1021054dddd9/3/1/1/1/

    1121024aaaa8/2/1/1/1/

    1221034bbbb8/2/1/1/2/

    1321034cccc8/2/1/1/3/

    1411073china 2/1/1/2/

    Here aaaa,bbbb and cccc are Level4 names under aaa(level-3 name)

    and dddd is level4-name under ddd(level-3 name)

    and china and india are level-3 names under Asia(Level2 name)

  • Understood. Thanks for the update especially on the desired results. I have to prep for work tomorrow so I can't guarantee that I can get to this tonight but I do have two ideas and will try to get to it. It's a fun problem.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks

  • This is one ugly looking query:w00t:, but it should perform OK.

    I've created node ids using dense rank across each of the levels and adding the max rank of the previous levels to them. This way I have the IDs before I unpivot the data with a cross apply to a values statement.

    I've also used another set of dense ranks to create internal sequences to build the hierarchy paths the way indicated

    SELECT

    b.ID,

    a.Category,

    b.Leaf_Node_Code,

    b.Lvl,

    b.[Name Node_ID],

    b.Parent_ID,

    b.Node_ID_String

    FROM (

    SELECT s.*,

    -- Node IDs (Will have some gaps in the sequence)

    N1 = CASE WHEN s.Level1_Name IS NOT NULL THEN DENSE_RANK() OVER (ORDER BY s.Category, s.Level1_Name) ELSE NULL END , -- First Level Node ID

    N2 = M1 + CASE WHEN s.Level2_Name IS NOT NULL THEN DENSE_RANK() OVER (ORDER BY s.Category, s.Level1_Name, s.Level2_Name) ELSE NULL END ,-- Second Level Node ID

    N3 = M1 + M2 + CASE WHEN s.Level3_Name IS NOT NULL THEN DENSE_RANK() OVER (ORDER BY s.Category, s.Level1_Name, s.Level2_Name, s.Level3_Name) ELSE NULL END ,-- Third Level Node ID

    N4 = M1 + M2 + M3 + CASE WHEN s.Level4_Name IS NOT NULL THEN DENSE_RANK() OVER (ORDER BY s.Category, s.Level1_Name, s.Level2_Name, s.Level3_Name, s.Level4_Name) ELSE NULL END ,-- Fourth Level Node ID

    N5 = M1 + M2 + M3 + M4 + CASE WHEN s.Level5_Name IS NOT NULL THEN DENSE_RANK() OVER (ORDER BY s.Category, s.Level1_Name, s.Level2_Name, s.Level3_Name, s.Level4_Name, s.Level5_Name) ELSE NULL END, -- Fifth Level Node ID

    -- Node Heirarchy IDs (The Ns could be used)

    I1 = DENSE_RANK() OVER (ORDER BY s.Category, s.Level1_Name) , -- First Level Node ID

    I2 = DENSE_RANK() OVER (PARTITION BY s.Category, s.Level1_Name ORDER BY s.Level2_Name) ,-- Second Level Node ID

    I3 = DENSE_RANK() OVER (PARTITION BY s.Category, s.Level1_Name, s.Level2_Name ORDER BY s.Level3_Name) ,-- Third Level Node ID

    I4 = DENSE_RANK() OVER (PARTITION BY s.Category, s.Level1_Name, s.Level2_Name, s.Level3_Name ORDER BY s.Level4_Name) ,-- Fourth Level Node ID

    I5 = DENSE_RANK() OVER (PARTITION BY s.Category, s.Level1_Name, s.Level2_Name, s.Level3_Name, s.Level4_Name ORDER BY s.Level5_Name) -- Fifth Level Node ID

    FROM #source AS s

    CROSS APPLY (

    -- Used to create IDs

    SELECT MAX(N1) M1, MAX(N2) M2,MAX(N3) M3,MAX(N4) M4,MAX(N5) M5

    FROM (

    SELECT

    N1 = CASE WHEN s.Level1_Name IS NOT NULL THEN DENSE_RANK() OVER (ORDER BY s.Category, s.Level1_Name) ELSE NULL END , -- First Level Node ID

    N2 = CASE WHEN s.Level2_Name IS NOT NULL THEN DENSE_RANK() OVER (ORDER BY s.Category, s.Level1_Name, s.Level2_Name) ELSE NULL END ,-- Second Level Node ID

    N3 = CASE WHEN s.Level3_Name IS NOT NULL THEN DENSE_RANK() OVER (ORDER BY s.Category, s.Level1_Name, s.Level2_Name, s.Level3_Name) ELSE NULL END ,-- Third Level Node ID

    N4 = CASE WHEN s.Level4_Name IS NOT NULL THEN DENSE_RANK() OVER (ORDER BY s.Category, s.Level1_Name, s.Level2_Name, s.Level3_Name, s.Level4_Name) ELSE NULL END ,-- Fourth Level Node ID

    N5 = CASE WHEN s.Level5_Name IS NOT NULL THEN DENSE_RANK() OVER (ORDER BY s.Category, s.Level1_Name, s.Level2_Name, s.Level3_Name, s.Level4_Name, s.Level5_Name) ELSE NULL END -- Fifth Level Node ID

    FROM #source AS s

    ) AS x

    ) AS y

    ) as a

    CROSS APPLY (

    --Unpivot the data and build heirarchy string, etc

    VALUES

    --Level 1

    (1, --Level Number

    a.N1, -- Node ID

    a.Level1_Name, -- Name Value

    '/' + CAST(a.I1 AS VARCHAR(10)), -- Build Node Heirarchy

    CASE WHEN a.Level2_Name IS NULL THEN a.leaf_node_code ELSE null END, -- Leaf_Node_Code if last populated level name

    NULL -- Parent

    ),

    --Level 2

    (2, --Level Number

    a.N2, -- Node ID

    a.Level2_Name, -- Name Value

    CASE WHEN a.Level2_Name IS NOT NULL THEN '/' + CAST(a.I1 AS VARCHAR(10)) + '/' + CAST(a.I2 AS VARCHAR(10)) ELSE NULL END, -- Build Node Heirarchy

    CASE WHEN a.Level3_Name IS NULL THEN a.leaf_node_code ELSE null END, -- Leaf_Node_Code if last populated level name

    a.N1 -- Parent

    ),

    --Level 3

    (3, --Level Number

    a.N3, -- Node ID

    a.Level3_Name, -- Name Value

    CASE WHEN a.Level3_Name IS NOT NULL THEN '/' + CAST(a.I1 AS VARCHAR(10)) + '/' + CAST(a.I2 AS VARCHAR(10)) + '/' + CAST(a.I3 AS VARCHAR(10)) ELSE NULL END, -- Build Node Heirarchy

    CASE WHEN a.Level4_Name IS NULL THEN a.leaf_node_code ELSE null END, -- Leaf_Node_Code if last populated level name

    a.N2 -- Parent

    ),

    --Level 4

    (4, --Level Number

    a.N4, -- Node ID

    a.Level4_Name, -- Name Value

    CASE WHEN a.Level4_Name IS NOT NULL THEN '/' + CAST(a.I1 AS VARCHAR(10)) + '/' + CAST(a.I2 AS VARCHAR(10)) + '/' + CAST(a.I3 AS VARCHAR(10)) + '/' + CAST(a.I4 AS VARCHAR(10)) ELSE NULL END, -- Build Node Heirarchy

    CASE WHEN a.Level5_Name IS NULL THEN a.leaf_node_code ELSE null END, -- Leaf_Node_Code if last populated level name

    a.N3 -- Parent

    ),

    --Level 5

    (5, --Level Number

    a.N5, -- Node ID

    a.Level5_Name, -- Name Value

    CASE WHEN a.Level5_Name IS NOT NULL THEN '/' + CAST(a.I1 AS VARCHAR(10)) + '/' + CAST(a.I2 AS VARCHAR(10)) + '/' + CAST(a.I3 AS VARCHAR(10)) + '/' + CAST(a.I4 AS VARCHAR(10)) + '/' + CAST(a.I5 AS VARCHAR(10)) ELSE NULL END,

    a.leaf_node_code, -- Leaf_Node_Code

    a.N4 -- Parent

    )

    ) AS b (Lvl, ID, [Name Node_ID], Node_ID_String, Leaf_Node_Code, Parent_ID)

    WHERE b.[Name Node_ID] is not null -- Filter out empty records

    GROUP BY b.ID, a.Category, b.Lvl, b.[Name Node_ID], b.Node_ID_String, b.Leaf_Node_Code, b.Parent_ID -- Make Distinct

    ORDER BY Node_ID_String

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply