Traversing a hierarchy

  • I have a table representing parent-child relationship and NoOfBooks for that relationship. Each child can then have its own children and so on. Each record has a value for NoOfBooks ranging from 0 to 100.

    What i want is for a parent (we can start with any parent), identify all relationships of the parent where sum of NoOfBooks for a child is greater than a threshold value (let's say 9). But the catch is, sum of NoOfBooks for a child's each parent should also be greater that the threshold value.

    So e.g.

    A-B 9

    A-C 10

    A-D 5

    B-E 5

    C-E 6

    E-F 12

    E-G 4

    D-H 14

    My Output should be

    A-B 9

    A-C 10

    B-E 5

    C-E 6

    E-F 12

    Here is the table with sample data.

    Create table MyTable (

    Id INT IDENTITY(1,1) Primary Key,

    ParentId INT,

    ChildId INT,

    NoOfBooks INT

    );

    INSERT INTO MyTable VALUES (1000, 2000, 10);

    INSERT INTO MyTable VALUES (1000, 2001, 9);

    INSERT INTO MyTable VALUES (1000, 2002, 8);

    INSERT INTO MyTable VALUES (2000, 2003, 4);

    INSERT INTO MyTable VALUES (2001, 2003, 5);

    INSERT INTO MyTable VALUES (2003, 2004, 9);

    INSERT INTO MyTable VALUES (2000, 2005, 3);

    INSERT INTO MyTable VALUES (2001, 2005, 3);

    INSERT INTO MyTable VALUES (2004, 2005, 3);

    INSERT INTO MyTable VALUES (2005, 2006, 5);

    INSERT INTO MyTable VALUES (2002, 2007, 12);

    INSERT INTO MyTable VALUES (2005, 2008, 5);

    INSERT INTO MyTable VALUES (2002, 2008, 6);

    Required output of above sample data (see attachment)

    Note that [1000, 2002, 8], [2005, 2006, 5], [2002, 2007, 12], [2005, 2008, 5], [2002, 2008, 6] are not included in my required output because either SUM(NoOfBooks) for the child is NOT greater than the threshold value OR SUM(NoOfBooks) for each of its parent is NOT greater than the threshold value.

  • I think you need to explain more what you want.

    You talk about two different measurements:

    A = Sum(NoOfBooks) for a child

    B = Sum(NoOfBooks) for a child's each parents

    Could you tell us exactly how these two values are calculated for each row in your sample data?

    Once we know exactly how these values are calculated it should be simple to formulate a query like this:

    SELECT * FROM MyTable WHERE A>9 AND B>9

  • A = Sum(NoOfBooks) for a child

    B = Sum(NoOfBooks) for a child's each parents

    Could you tell us exactly how these two values are calculated for each row in your sample data?

    That is exactly what i am looking for, i just provided the sample output by calculating it.

  • Here is a batch. I'm using temp tables, because I estimate that a single-query solution would be less efficient.

    Note that the solution does not include rows with ParentId = 1000, as sum(NoOfBooks) id = 1000 is 0.

    Create table MyTable (

    ParentId INT NOT NULL,

    ChildId INT NOT NULL,

    NoOfBooks INT NOT NULL,

    PRIMARY KEY (ParentId, ChildId)

    );

    INSERT INTO MyTable VALUES (1000, 2000, 10);

    INSERT INTO MyTable VALUES (1000, 2001, 9);

    INSERT INTO MyTable VALUES (1000, 2002, 8);

    INSERT INTO MyTable VALUES (2000, 2003, 4);

    INSERT INTO MyTable VALUES (2001, 2003, 5);

    INSERT INTO MyTable VALUES (2003, 2004, 9);

    INSERT INTO MyTable VALUES (2000, 2005, 3);

    INSERT INTO MyTable VALUES (2001, 2005, 3);

    INSERT INTO MyTable VALUES (2004, 2005, 3);

    INSERT INTO MyTable VALUES (2005, 2006, 5);

    INSERT INTO MyTable VALUES (2002, 2007, 12);

    INSERT INTO MyTable VALUES (2005, 2008, 5);

    INSERT INTO MyTable VALUES (2002, 2008, 6);

    CREATE TABLE #sumbooks(nodeid int NOT NULL PRIMARY KEY,

    sumbooks int NOT NULL)

    DECLARE @threshold int = 9

    INSERT #sumbooks(nodeid, sumbooks)

    SELECT ChildId, SUM(NoOfBooks)

    FROM MyTable

    GROUP BY ChildId

    HAVING SUM(NoOfBooks) >= @threshold

    SELECT * FROM #sumbooks

    DELETE #sumbooks

    FROM #sumbooks a

    WHERE EXISTS (SELECT *

    FROM MyTable MT

    LEFT JOIN #sumbooks b ON MT.ParentId = b.nodeid

    WHERE MT.ChildId = a.nodeid

    AND isnull(b.sumbooks, 0) < @threshold)

    SELECT * FROM #sumbooks

    ; WITH rekurs AS (

    SELECT StartingParent = 1000, ParentId, ChildId, NoOfBooks

    FROM MyTable

    WHERE ParentId = 1000

    UNION ALL

    SELECT r.StartingParent, MT.ParentId, MT.ChildId, MT.NoOfBooks

    FROM rekurs r

    JOIN MyTable MT ON r.ChildId = MT.ParentId

    )

    SELECT DISTINCT r.*

    FROM rekurs r

    WHERE EXISTS (SELECT *

    FROM #sumbooks s

    WHERE s.nodeid = r.ChildId)

    ORDER BY r.ChildId, r.ParentId

    go

    DROP TABLE #sumbooks

    DROP TABLE MyTable

    [font="Times New Roman"]Erland Sommarskog, SQL Server MVP, www.sommarskog.se[/font]

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply