Hierarchies in SQL, Part II, the Sequel

  • There are some people who insist on building the Nested Sets data in a single query. I don’t know why, since this does it accurately and in about 1 second on my machine. Surely that’s fast enough and efficient enough.

    Yowch. I'm not sure what's going on but I have an I5 laptop w/6GB of RAM and I'm running SQL Server 2008 SP3 (version 10.0.5500.0). It' takes 00:02:34 for the first merge of the 10,000 row example to run and the execution plan shows an internal arrow with almost a 90 million rowcount. I thought it might be some form of parameter sniffinng and even restarted the instance but to no avail. Any idea what might be going on there?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (9/10/2012)


    There are some people who insist on building the Nested Sets data in a single query. I don’t know why, since this does it accurately and in about 1 second on my machine. Surely that’s fast enough and efficient enough.

    Yowch. I'm not sure what's going on but I have an I5 laptop w/6GB of RAM and I'm running SQL Server 2008 SP3 (version 10.0.5500.0). It' takes 00:02:34 for the first merge of the 10,000 row example to run and the execution plan shows an internal arrow with almost a 90 million rowcount. I thought it might be some form of parameter sniffinng and even restarted the instance but to no avail. Any idea what might be going on there?

    Yes, but only because I messed up and copy-and-pasted the obsolete script into the article! The one that I said not to use any more a few paragraphs later. I'm guess that might be a bit confusing ....

    Replace the second script with this:

    WITH Hierarchy(ID, PID, Lvl, Pth)

    AS (SELECT ID,

    NULL,

    0,

    '/' + CAST(ID AS VARCHAR(MAX)) + '/'

    FROM dbo.HierarchyTest

    WHERE ParentID IS NULL

    UNION ALL

    SELECT HSub.ID,

    HSub.ParentID,

    Hierarchy.Lvl + 1,

    Hierarchy.Pth + CAST(HSub.ID AS VARCHAR(MAX)) + '/'

    FROM dbo.HierarchyTest AS HSub

    INNER JOIN Hierarchy

    ON HSub.ParentID = Hierarchy.ID)

    MERGE dbo.HierarchyTest AS H

    USING

    (SELECT ID,

    PID,

    Lvl,

    Pth FROM Hierarchy) AS Paths

    ON H.ID = Paths.ID

    WHEN MATCHED

    THEN UPDATE

    SET H.HID = Paths.Pth ;

    Has nothing to do with the Nested Sets. That's the third part of the script. This part is only meant to generate the HierarchyID data.

    The Nested Sets bit is:

    WITH CTE

    AS (SELECT ID,

    ROW_NUMBER() OVER (ORDER BY HT1.HID) AS R

    FROM dbo.HierarchyTest AS HT1

    WHERE RangeStart IS NULL

    AND RangeEnd IS NULL

    AND NOT EXISTS ( SELECT *

    FROM dbo.HierarchyTest AS HT2

    WHERE HT2.ParentID = HT1.ID ))

    UPDATE Tgt

    SET RangeStart = R,

    RangeEnd = R

    FROM dbo.HierarchyTest AS Tgt

    INNER JOIN CTE

    ON CTE.ID = Tgt.ID ;

    WHILE @@ROWCOUNT > 0

    BEGIN

    WITH CTE

    AS (SELECT

    HT1.ID,

    MIN(HT2.RangeStart) AS RS,

    MAX(HT2.RangeEnd) AS RE

    FROM dbo.HierarchyTest AS HT1

    INNER JOIN dbo.HierarchyTest AS HT2

    ON HT1.ID = HT2.ParentID

    WHERE HT1.RangeStart IS NULL

    AND HT1.RangeEnd IS NULL

    AND HT2.RangeStart IS NOT NULL

    AND HT2.RangeEnd IS NOT NULL

    GROUP BY HT1.ID)

    UPDATE Tgt

    SET RangeStart = CTE.RS,

    RangeEnd = CTE.RE

    FROM dbo.HierarchyTest AS Tgt

    INNER JOIN CTE

    ON Tgt.ID = CTE.ID ;

    END ;

    That's correct in the article. It's the HID script that's wrong.

    On a 10-thousand row hierarchy (as per that part of the article), generating the Nested Sets data takes less than 3 milliseconds on my desktop machine (Core i7 Quad, 16 Gig of RAM). Can't tell how much less, since I tested it by setting a variable to GetDate at the beginning then select the DateDiff in milliseconds, and it came up 0. That means less than 3 milliseconds, as per usual rules on accuracy of GetDate.

    Sorry for the incorrect script. Not sure how I missed that mistake.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Good article.

  • Thanks.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

Viewing 4 posts - 16 through 18 (of 18 total)

You must be logged in to reply to this topic. Login to reply