Denormalising report from normalised nested source data

  • Hi

    I have an application that stores normalised and nested data. Each top level instance is identified by an instance Id. Each line of data has a Lineage and NestedIndexId such that it's child has the same instance Id with the Lineage = Parent Lineage+'_'+ Parent NestedIndexId. The data item is indicated by columns [DataName] and [DataValue]

    This structure allows a flexible data storage for any number of records and level of nesting. Using an object model on the application side to represent and manage this is easy enough, but I am battling with a way to run a flat report to get the data out in a usable format.

    Hopefully the code below describes what I am trying to explain, together with where I am with pivoting the data:

    CREATE TABLE #Data (InstanceId INT, Lineage VARCHAR(12), IndexId INT, DataName VARCHAR(20),DataValue VARCHAR(100))

    INSERT INTO #Data

    ( InstanceId

    , Lineage

    , IndexId

    , DataName

    , DataValue

    )

    VALUES ( 1, '0' ,0,'Parent','1 Parent 0')

    ,( 1, '0_0' ,0,'Child','1 Child 0 0')

    ,( 1, '0_0_0',0,'Grandchild','1 Grandchild 0 0')

    ,( 1, '0_0_0',1,'Grandchild','1 Grandchild 0 1')

    ,( 1, '0_0' ,1,'Child','1 Child 0 0')

    ,( 1, '0_0_1',0,'Grandchild','1 Grandchild 1 0')

    ,( 1, '0_0_1',1,'Grandchild','1 Grandchild 1 1')

    ,( 2, '0' ,0,'Parent','2 Parent 0')

    ,( 2, '0_0' ,0,'Child','2 Child 0 0')

    ,( 2, '0_0_0',0,'Grandchild','2 Grandchild 0 0')

    ,( 2, '0_0_0',1,'Grandchild','2 Grandchild 0 1')

    ,( 2, '0_0' ,1,'Child','2 Child 0 0')

    ,( 2, '0_0_1',0,'Grandchild','2 Grandchild 1 0')

    ,( 2, '0_0_1',1,'Grandchild','2 Grandchild 1 1')

    SELECT * FROM

    (

    SELECT InstanceId

    ,Lineage

    ,IndexId

    ,DataValue

    ,DataName

    FROM#Data

    ) SourceData

    PIVOT ( MAX(DataValue) FOR DataName in ([Parent],[Child],[GrandChild])) As PVT

    ORDER BY InstanceId,Lineage + '_'+CAST(IndexId AS VARCHAR(12))

    DROP TABLE #Data

    I need the output of the query to be formatted in the following fashion:

    Parent Child GrandChild

    1 Parent 0 1 Child 0 01 Grandchild 0 0

    1 Parent 0 1 Child 0 01 Grandchild 0 1

    1 Parent 0 1 Child 0 01 Grandchild 1 0

    1 Parent 0 1 Child 0 01 Grandchild 1 1

    2 Parent 0 2 Child 0 02 Grandchild 0 0

    2 Parent 0 2 Child 0 02 Grandchild 0 1

    2 Parent 0 2 Child 0 02 Grandchild 1 0

    2 Parent 0 2 Child 0 02 Grandchild 1 1

    but the closes I am able to get from a query like the one above is

    InstanceIdLineageIndexIdParentChild GrandChild

    1001 Parent 0

    10_001 Child 0 0

    10_0_001 Grandchild 0 0

    10_0_011 Grandchild 0 1

    10_011 Child 0 0

    10_0_101 Grandchild 1 0

    10_0_111 Grandchild 1 1

    2002 Parent 0

    20_002 Child 0 0

    20_0_002 Grandchild 0 0

    20_0_012 Grandchild 0 1

    20_012 Child 0 0

    20_0_102 Grandchild 1 0

    20_0_112 Grandchild 1 1

    Any ideas appreciated

    Gavin

  • Have a look at this article in order to use an effective hierarchical model in SQL Server.

    http://msdn.microsoft.com/en-us/magazine/cc794278.aspx

  • @gavin.cox,

    First, very nice job on providing readily consumable data.

    My question to you is, will there only be 3 levels in this data?

    Also, please post the code you used for your "closest" attempt. It looks like with just a tweek to that code, we could have this for you.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hi Jeff

    Thanks for the reply. The structure allows for any level of nesting so there may be more than 3 levels. As such the query will return whatever columns need returning to represent the underlying data in a denormalised, flat output.

    I actually use a dynamic SQL in production (I know, I know...) which is the only way to dynamically flatten the data generically. The "nesting" is new functionality. The dynamic and unintrusive nature of declaring new "columns" and nestings is great - it's just the generic reporting that's a pain.

    Below is an amendment that more closely resembles what I'm doing in production.

    /*CREATE DATA TABLE*/

    CREATE TABLE ##Data

    (

    InstanceId INT

    , Lineage VARCHAR(12)

    , IndexId INT

    , DataName VARCHAR(20)

    , DataValue VARCHAR(100)

    )

    /*POPULATE DATA TABLE*/

    INSERT INTO ##Data

    ( InstanceId, Lineage, IndexId, DataName, DataValue )

    VALUES ( 1, '0', 0, 'Parent', '1 Parent 0' )

    ,( 1, '0_0', 0, 'Child', '1 Child 0 0' )

    ,( 1, '0_0_0', 0, 'Grandchild', '1 Grandchild 0 0' )

    ,( 1, '0_0_0', 1, 'Grandchild', '1 Grandchild 0 1' )

    ,( 1, '0_0', 1, 'Child', '1 Child 0 0' )

    ,( 1, '0_0_1', 0, 'Grandchild', '1 Grandchild 1 0' )

    ,( 1, '0_0_1', 1, 'Grandchild', '1 Grandchild 1 1' )

    ,( 2, '0', 0, 'Parent', '2 Parent 0' )

    ,( 2, '0_0', 0, 'Child', '2 Child 0 0' )

    ,( 2, '0_0_0', 0, 'Grandchild', '2 Grandchild 0 0' )

    ,( 2, '0_0_0', 1, 'Grandchild', '2 Grandchild 0 1' )

    ,( 2, '0_0', 1, 'Child', '2 Child 0 0' )

    ,( 2, '0_0_1', 0, 'Grandchild', '2 Grandchild 1 0' )

    ,( 2, '0_0_1', 1, 'Grandchild', '2 Grandchild 1 1' )

    /*BUILD OUTPUT COLUMN LIST*/

    DECLARE @ColumnList NVARCHAR(MAX) = ''

    SELECT @ColumnList = @ColumnList +

    + CASE WHEN LEN(@ColumnList) > 0 THEN ',' ELSE '' END

    + '[' + DataName + ']'

    FROM (SELECT DISTINCT DataName FROM ##Data) SourceData

    PRINT @ColumnList

    /*REPORTING QUERY*/

    DECLARE @Sql NVARCHAR(MAX) = N'

    SELECT * FROM

    (

    SELECT InstanceId

    ,Lineage

    ,IndexId

    ,DataValue

    ,DataName

    FROM##Data

    ) SourceData

    PIVOT

    ( MAX(DataValue) FOR DataName in (' + @ColumnList + ')) As PVT

    ORDER BY InstanceId

    , Lineage + ''_'' + CAST(IndexId AS VARCHAR(12))

    '

    PRINT @Sql

    EXEC sp_ExecuteSql @Sql

    /*CLEAN UP*/

    DROP TABLE ##Data

    Thanks

    Gavin

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply