• Jason-299789 (3/1/2011)


    Hi,

    We are starting work on a BI project that is given data in a traditional Parent-Child Hierarchy format.

    The Problem is that we need to Flatten the Hierarchy out the method that we have is to create a delimited string via a standard recursive CTE, then using a UDF string spliter, with a sample record set of just 9 rows, and 4 levels, the query curently takes around 1.5 seconds.

    The actual record set we have to flatten is around 3-5K rows, and to complicate matters we need to run this as a low-latency, every 15 minutes, so its critical that we parse these structures as quickly as possible.

    I have managed to pivot the dataset and get the leaf nodes correct, by using this script

    WITH DirectReports(Name,Title, Manager, EmployeeID, ManagerId, EmployeeLevel,Sort)

    AS

    (SELECT CONVERT(varchar(255), e.FirstName + ' ' + e.LastName),

    e.Title,

    Convert(varchar, NULL) Manager,

    e.EmployeeID,

    convert(int,Null) ManagerId,

    0,

    CONVERT(varchar(255), '\'+e.FirstName + ' ' + e.LastName)

    FROM dbo.MyEmployees AS e

    WHERE e.ManagerID IS NULL

    UNION ALL

    SELECT

    CONVERT(varchar(255), e.FirstName + ' ' + e.LastName),

    e.Title,

    Convert(varchar,d.Name) name,

    e.EmployeeID,

    convert(int,d.EmployeeID) as ManagerId,

    EmployeeLevel + 1,

    CONVERT (varchar(255), RTRIM(Sort) + '\' + FirstName + ' ' + LastName)

    FROM dbo.MyEmployees AS e

    JOIN DirectReports AS d ON e.ManagerID = d.EmployeeID

    ),

    Piv AS (

    Select *

    FROM (Select d.EmployeeId,d.ManagerId,d.Name, d.EmployeeLevel,d.EmployeeLevel c,Sort

    from DirectReports d

    ) T

    PIVOT (MAX(Name) For EmployeeLevel IN ([0],[1],[2],[3],[4],[5],[6],[7],[8],[9])) p

    )

    Select EmployeeId,[0],[1],[2],[3],[4]

    from Piv

    order by Sort

    The speed is about what we need at 1ms, however it returns this data set

    Id01234

    1Ken SánchezNULLNULLNULLNULL

    273NULLBrian WelckerNULLNULLNULL

    16NULLNULLDavid BradleyNULLNULL

    23NULLNULLNULLMary GibsonNULL

    274NULLNULLStephen JiangNULLNULL

    276NULLNULLNULLLinda MitchellNULL

    275NULLNULLNULLMichael BlytheNULL

    285NULLNULLSyed AbbasNULLNULL

    286NULLNULLNULLLynn TsofliasNULL

    as you can see the nodes are Parent Levels are blank.

    This is result set I'm that trying to get out.

    IdLevel0 Level1 Level2 Level3 Level4

    1Ken SánchezNULLNULLNULLNULL

    16Ken SánchezBrian WelckerDavid BradleyNULLNULL

    23Ken SánchezBrian WelckerDavid BradleyMary GibsonNULL

    273Ken SánchezBrian WelckerNULLNULLNULL

    274Ken SánchezBrian WelckerStephen JiangNULLNULL

    275Ken SánchezBrian WelckerStephen JiangMichael BlytheNULL

    276Ken SánchezBrian WelckerStephen JiangLinda MitchellNULL

    285Ken SánchezBrian WelckerSyed AbbasNULLNULL

    286Ken SánchezBrian WelckerSyed AbbasLynn TsofliasNULL

    Any pointers or advice gratefully accepted, I suspect theres a Self join back into the CTE or Source table But I cant for the life of me figure it out.

    Hi Jason,

    Yeah, I realize that this thread is well over a year old but I hope you're still around because I found it very interesting. What I'd really like to know is "why". That is, why did you need to present the hierarchical data in a flattened format like this?

    And, no... I'm not getting ready to start a "fight" over whether it's right or wrong. I'm just really curious as to what the actual business requirement to do this is. If you get the chance, I sure would like to hear about it.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)