Representing a hierarchy from comma separated values

  • Hi

    I would like to populate a bill of materials (#bom) type table from another table (#tree)

    that has an attribute which represents a hierarchy as comma separated values.

    Create table #tree

    (

    id int identity (1,1)

    ,tree varchar (100)

    )

    insert into #tree select 'grandson1,son1,grandfather1'

    insert into #tree select 'grandson2,son1,grandfather1'

    insert into #tree select 'son2,grandfather1'

    insert into #tree select 'grandson3,daughter1,grandfather1'

    insert into #tree select 'son1,grandfather2'

    insert into #tree select 'granddaughter1,daughter1,grandfather2'

    insert into #tree select 'grandson1,daughter1,grandfather2'

    insert into #tree select 'grandson2,son2,grandfather2'

    Create table #bom

    (

    id int identity (1,1)

    ,value varchar (1000)

    ,parent_id int

    ,child_id int

    )

    Can I do this with a set based solution?

    Thanks

  • Please clarify the interpretation of the "hierarchy".

    Example: 'grandson1,son1,grandfather1' can mean anything:

    Some scenarios:

    a)

    Parent: grandfather1

    Child1: son1

    Child2: grandson1

    b)

    Parent: grandfather1

    Child: son1

    Parent: son1

    Child: grandson1

    c)

    Parent: son1

    LeftChild: grandson1

    RightChild: grandfather1

    ...and a few more possible solutions...

    Side note: since there is no father or mother in between, it's not really a family tree...



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • Hi Lutx

    Actually on second inspection I realised that I dont need the child_id in the #BOM table.

    What would be helpful in its place would be a level_id which

    represents the level in the hierarcy from the root.

    I think that the #BOM would be represented like this

    (sorry the editor changes the formatting of the table below)

    idvalueparent_id level_id

    1grandfather1 null 1

    2grandfather2 null 1

    3son112

    4son212

    5daughter112

    6son122

    7daughter122

    8son222

    9grandson133

    10grandson233

    11grandson353

    12granddaughter1 73

    13grandson173

    14grandson283

    Thanks

  • pYak (3/17/2010)


    Can I do this with a set based solution?

    The answer is... kind of. In SQL Server 2005 and below, you need some form of looping for Hierarchy resolution whether it be a Recursive CTE (which is actually a loop that can be worse than a cursor) or some form of While Loop. The key is to limit the number of loops and to do as much set-based processing in each loop that you can. The following code only loops 3 times because there are 3 levels in the data.

    As a side bar, the code does what you asked for... I don't believe it does what you need, though. The Hierarchy column should be ID's... not Values... that way, you could interogate the Hierarchy in many different ways. Let me know if that's what you really want to do.

    --===== If the temporary demonstration tables exist, drop them

    IF OBJECT_ID('TempDB..#Original') IS NOT NULL

    DROP TABLE #Original

    IF OBJECT_ID('TempDB..#Hierarchy') IS NOT NULL

    DROP TABLE #Hierarchy

    --===== Create the table to hold the data posted for test

    -- This is NOT a part of the solution. It's just to setup a test.

    CREATE TABLE #Original

    (

    ID INT PRIMARY KEY CLUSTERED,

    Value VARCHAR(20),

    Parent_ID INT

    )

    --===== Populate the table with the test data posted

    -- This is NOT a part of the solution. It's just to setup a test.

    INSERT INTO #Original (ID, Value, Parent_ID)

    SELECT '1','grandfather1',NULL UNION ALL

    SELECT '2','grandfather2',NULL UNION ALL

    SELECT '3','son1','1' UNION ALL

    SELECT '4','son2','1' UNION ALL

    SELECT '5','daughter1','1' UNION ALL

    SELECT '6','son1','2' UNION ALL

    SELECT '7','daughter1','2' UNION ALL

    SELECT '8','son2','2' UNION ALL

    SELECT '9','grandson1','3' UNION ALL

    SELECT '10','grandson2','3' UNION ALL

    SELECT '11','grandson3','5' UNION ALL

    SELECT '12','granddaughter1','7' UNION ALL

    SELECT '13','grandson1','7' UNION ALL

    SELECT '14','grandson2','8'

    --===== Test setup complete, we're ready to rock!

    --===== Create the Hierarchy table

    CREATE TABLE #Hierarchy

    (

    ID INT PRIMARY KEY,

    Parent_ID INT,

    Level INT,

    Hierarchy VARCHAR(8000)

    )

    --===== Seed the Hierarchy table with the top level

    INSERT INTO #Hierarchy

    (ID, Parent_ID, Level, Hierarchy)

    SELECT ID,

    Parent_ID,

    0 AS Level,

    VALUE AS Hierarchy

    FROM #Original

    WHERE Parent_ID IS NULL

    --===== Create and preset a level counter.

    DECLARE @CurrentLevel INT

    SET @CurrentLevel = 0

    --===== Determine the rest of the hierarchy

    -- The loop processes sets of rows based on level.

    -- Since there are only 3 levels, this makes 3 passes

    -- and the last pass is a "zero" work pass

    WHILE @@ROWCOUNT > 0

    BEGIN

    SET @CurrentLevel = @CurrentLevel + 1 --Started at 0

    INSERT INTO #Hierarchy

    (ID, Parent_ID, Level, Hierarchy)

    SELECT p.ID,

    p.Parent_ID,

    @CurrentLevel AS Level,

    h.Hierarchy + ',' + p.Value

    FROM #Original p

    INNER JOIN #Hierarchy h

    ON p.Parent_ID = h.ID

    AND h.Level = @CurrentLevel - 1

    END

    --===== Show the Hierarchy table

    SELECT * FROM #hierarchy

    ID Parent_ID Level Hierarchy

    ----------- ----------- ----------- -------------------------------------

    1 NULL 0 grandfather1

    2 NULL 0 grandfather2

    3 1 1 grandfather1,son1

    4 1 1 grandfather1,son2

    5 1 1 grandfather1,daughter1

    6 2 1 grandfather2,son1

    7 2 1 grandfather2,daughter1

    8 2 1 grandfather2,son2

    9 3 2 grandfather1,son1,grandson1

    10 3 2 grandfather1,son1,grandson2

    11 5 2 grandfather1,daughter1,grandson3

    12 7 2 grandfather2,daughter1,granddaughter1

    13 7 2 grandfather2,daughter1,grandson1

    14 8 2 grandfather2,son2,grandson2

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Did that help or what?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hi Jeff

    Thanks for the reply. I think my problem is a bit different

    than the solution you provided in that the id of the parent

    for the child in #tree is not known ie the csv (tree)

    implies that relation. I can tear the csv apart and order it into

    a relation like:

    id level val treeval

    11grandfather1grandson1,son1,grandfather1

    12son1grandson1,son1,grandfather1

    13grandson1grandson1,son1,grandfather1

    21grandfather1grandson2,son1,grandfather1

    22son1grandson2,son1,grandfather1

    23grandson2grandson2,son1,grandfather1

    31grandfather1son2,grandfather1

    32son2son2,grandfather1

    41grandfather1grandson3,daughter1,grandfather1

    42daughter1grandson3,daughter1,grandfather1

    43grandson3grandson3,daughter1,grandfather1

    51grandfather2son1,grandfather2

    52son1son1,grandfather2

    but I still have a problem trying to insert unique values

    into a parent-child (#bom) table in that

    levels after 1 need to know the parent in the parent-child table

    and the parent in treeval in order to determine its uniqueness

    and the relation.

    Thanks

  • Thanks Paul...I'll have a look at your links to see if it can

    shed any light on a solution.

  • Patrick Dadey (3/20/2010)


    I think my problem is a bit different

    than the solution you provided in that the id of the parent

    for the child in #tree is not known

    Maybe I'm seriously misunderstanding but without that information, I'm not sure how you have a hierarchy.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hi Jeff

    I'm trying to extract a heirarcy from the csv value within each row

    where you have a natural hierarchy eg grandson1->son1->grandfather1

    and then relate that to the other rows in the table (given that the root is unique).

    I think I need to code something like:

    Separate the csv into indivdual rows into table A with attributes for:

    level, parsed value, immediate parent and original csv

    level val parent csv

    1 grandfather1 null grandson1,son1,grandfather1

    2 son1 grandfather1 grandson1,son1,grandfather1

    3 grandson1 son1 grandson1,son1,grandfather1

    Populate table B with a identity id and parent_id (set to null) and

    the values from table A.

    Update parent_id in table B with

    find the id of the parent where child.parent = parent.val and child.val in parent.csv

    pYak

  • Well I've finally figured it out.

    Thanks to everyone for the pointers.

    I've posted the solution below for anyone who may have a similar problem.

    Its not elegant but does the job.

    --drop temporary tables

    IF OBJECT_ID('TempDB..#tree') IS NOT NULL

    DROP TABLE #tree;

    CREATE TABLE #tree

    (

    IDINT IDENTITY (1,1)

    , treeNVARCHAR(1000)

    );

    IF OBJECT_ID('TempDB..#bom') IS NOT NULL

    DROP TABLE #bom;

    CREATE TABLE #bom

    (

    IDINT IDENTITY (1,1)

    ,parent_idINTNULL

    ,lvlTINYINT

    ,valNVARCHAR (1000)

    ,preParentNVARCHAR (1000)

    ,parentNVARCHAR (1000)NULL

    );

    insert into #tree select 'grandson1,son1,grandfather1'

    insert into #tree select 'grandson2,son1,grandfather1'

    insert into #tree select 'son2,grandfather1'

    insert into #tree select 'grandson3,daughter1,grandfather1'

    insert into #tree select 'son1,grandfather2'

    insert into #tree select 'granddaughter1,daughter1,grandfather2'

    insert into #tree select 'grandson1,daughter1,grandfather2'

    insert into #tree select 'grandson2,son2,grandfather2'

    ;WITH cteSplit AS

    (

    SELECT

    [tree_id]= #tree.id

    , [lvl]= ROW_NUMBER () OVER (PARTITION BY tree ORDER BY (SELECT 1))

    , [pat]= #tree.tree

    , [val]= REVERSE(split.val)

    , [preParent]=

    REPLACE (

    SUBSTRING (tree, CHARINDEX(',' + REVERSE(split.val), tree) + 1 , LEN(tree))

    , REVERSE(',' + split.val), '')

    FROM #tree

    --dbo.Split splits csv values into rows

    --http://www.sqlservercentral.com/articles/XML/66932/

    CROSS APPLY dbo.Split(REVERSE(#tree.tree), ',') split

    )

    , cteParent

    AS

    (

    SELECT

    tree_id

    , lvl

    , val

    , pat

    , preParent

    , [parent]=

    CASE

    WHEN lvl = 1

    THEN NULL

    ELSE COALESCE(NULLIF(REPLACE

    (SUBSTRING(preParent,1,CHARINDEX(',',preParent))

    ,',','')

    ,''),preParent)

    END

    FROM cteSplit

    )

    , cteOrder

    AS

    (

    SELECT DISTINCT TOP 100 PERCENT

    parent_ID= NULL

    , lvl= cteParent.lvl

    , cteParent.val

    , cteParent.preParent

    , cteParent.parent

    , [sorter]= REVERSE(cteParent.preParent)

    FROM cteParent

    LEFT JOIN cteParent AS child

    ON child.lvl= cteParent.lvl - 1

    AND cteParent.parent = child.val

    AND cteParent.parent IS NULL

    ORDER BY

    cteParent.lvl

    , REVERSE(cteParent.preParent)

    )

    INSERT INTO #bom

    (

    parent_ID

    , lvl

    , val

    , preParent

    , parent

    )

    SELECT

    parent_ID

    , lvl

    , val

    , preParent

    , parent

    FROM cteOrder

    UPDATE #bom

    SET parent_id = parent.id

    FROM #bom

    LEFT JOIN #bom AS parent

    ON #bom.parent = parent.val

    AND CHARINDEX(parent.val, #bom.preParent) > 0

    SELECT

    parent_ID

    , lvl

    , val

    , preParent

    , parent

    FROM #bom

  • Thanks for posting the solution. I apologize... for some reason I wasn't getting the fact that you were trying to create the Adjacency Set from the Hierarchical Path. More coffee please... :blush:

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I haven't had time to analyse that solution fully, but I sure hope it doesn't depend on TOP 100 PERCENT...ORDER BY working as it did in SQL Server 2000. See the following blog entry by the SQL Server Query Optimization Team...

    TOP 100 Percent Considered Harmful

  • Hi Jeff.

    As I said the solution works for me but is not very elegant.

    Would you know of any other solutions that tackle the same problem?

    Thanks

  • Hi Paul

    Your right about the top 100 percent....

    I'll have another go when I have time to find a work around.

    pYak

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply