Displaying hierarchical data has been made easy with Reporting Services 2008. It usually involves creating a self-referenced table with ParentID
associated to the primary key (e.g., ManagerID
to EmployeeID
) and then setting the recursive parent in Reporting Services. To format the hierarchical data, a common approach is to indent descendants and add visibility toggles to the ancestors:
This seems to be straightforward enough; however, if you try to render a large dataset with thousands of rows, you will notice that the entire dataset is rendered on one page and causes performance issues. The performance is poor because the automatic paging based on the report’s interactive size is disabled when the visibility toggle is turned on. This limitation makes it difficult to efficiently display a complex hierarchy. In this article, we will present an approach leveraging the HIERARCHYID
data type to store the hierarchical data and performs custom paging in SQL server.
SQL scripts, Reporting Services solution, and the sample database discussed in this article are available for download.
Prestage Data
To simulate complex hierarchical structural, we will use the Customer table from AdventureWorks database:
SELECT CustomerID, FirstName + ' ' + COALESCE(MiddleName, '') + ' ' + LastName + ' Inc.' AS CustomerName, ABS(CAST(NEWID() AS BINARY(6)) % 29484) AS ParentCustomerID INTO Customer$ FROM AdventureWorks.Sales.vIndividualCustomer GROUP BY CustomerID, FirstName, MiddleName, LastName
Note that we added a new field, ParentCustomerID
populated by a random number between 0 and 29483 (max. of CustomerID) to serve as our self-referencing ID. Since AdventureWorks’ CustomerID starts at 11,000, any ParentCustomerID
below 11,000 are not valid. To fix this, we will add a root node to the table and set invalid ParentCustomerID
to the newly added root node:
INSERT Customer$ VALUES (-1, 'root', NULL) UPDATE Customer$ SET ParentCustomerID = -1 -- root node WHERE ParentCustomerID < 11000 -- 11000 is MIN CustomerID
Model Hierarchical Data
To model our data using the HIERARCHYID data type, we need to populate the field with nodes mimicking a tree structure:
This can be done by recursive CTE:
;WITH P(CustomerID, CustomerName, ParentCustomerID, Node) AS ( SELECT CustomerID, CustomerName, ParentCustomerID, HIERARCHYID::GetRoot() AS Node FROM Customer$ WHERE ParentCustomerID IS NULL UNION ALL SELECT C.CustomerID, C.CustomerName, C.ParentCustomerID, CAST(P.Node.ToString() + CAST(ROW_NUMBER() OVER (PARTITION BY C.ParentCustomerID ORDER BY C.CustomerID) AS VARCHAR(30)) + '/' AS HIERARCHYID) FROM Customer$ AS C INNER JOIN P ON C.ParentCustomerID = P.CustomerID ) SELECT * INTO Customer FROM P
To optimize the table for efficiency, add a primary key and indicies with the following code listing:
-- Add PK ALTER TABLE Customer ALTER COLUMN CustomerID INT NOT NULL ALTER TABLE Customer ADD CONSTRAINT PK_Customer PRIMARY KEY CLUSTERED (CustomerID ASC) -- Add computed field for BFS index ALTER TABLE Customer ADD HierarchyLevel AS Node.GetLevel() -- Add breadth-first index CREATE UNIQUE INDEX IX_Customer_BFS ON Customer(HierarchyLevel, Node) -- Add depth-first index CREATE UNIQUE INDEX IX_SiebelAccount_DFS ON Customer(Node)
More detail on HIERARCHYID can be found at:
http://www.sqlservercentral.com/articles/SQL+Server+2008/62204/
http://www.sqlservercentral.com/articles/SQL+Server+2008/67787/
http://technet.microsoft.com/en-us/library/cc627504(SQL.100).aspx
Display Hierarchal Data
With the hierarchal data populated, we can start creating a report. For the demonstration purpose, we will simply use count of descendants as the aggregate in the report.
- Add a stored procedure to SQL Server for retrieving customer hierarchies:
CREATE PROCEDURE dbo.GetCustomers_1 AS BEGIN SELECT CustomerID, CustomerName, CASE WHEN ParentCustomerID = -1 THEN NULL ELSE ParentCustomerID END AS ParentCustomerID FROM Customer WHERE HierarchyLevel > 0 END
- Create a new report. Add a new Data Source pointing to our database
Organization
, then add a new dataset Customers using the stored proceduredbo.GetCustomers_1
. - In the report design view, drag a matrix to the canvas.
- For the simplicity, delete
[Column Group]
from the column groups. - Drag
CustomerID
to the Rows quadrant of the tablix to create a row group onCustomerID
. Change the row group name toRowGroup_CustomerID
. Change the row group text box association fromCustomerID
toCustomerName
.
Right click on
[RowGroup_CustomerID]
, then click on Group Properties:a. Under the Sorting tab, sort the group by
[CustomerName]
.b. Under Advanced tab, use set
ParentCustomerID
as recursive parent.Add left paddings to textbox [CustomerName]:
=CStr(5 + (Level()*12)) + "pt"
In the Data quadrant of the tablix, set the expression to
=Count(Fields!CustomerID.Value, "RowGroup_CustomerID", Recursive)
.
Run the report and note that the report is broken up into 420 pages due to the default interactive size of 8.5in x 11in.
Now, let’s add toggles to the ancestors to create drill-down effect. Right click on [RowGroup_CustomerID]
, click on Group Properties. Under the Visibility tab, set “When the report is initially run” to Hide, check “Display can be toggled by this report item”, and select CustomerName from the dropdown menu. Run the report again.
It should be obvious that there is a problem. Although the visibility toggle displays as expected, the pagination disappears. In Visual Studio, the report renders our testing dataset up to Chad C Jai Inc., while when deployed, the report renders the entire dataset on one page. This can cause a performance issue when working with complex hierarchy. Specifically, in our testing environment, the average refresh time for clicking on the visibility toggle is around 410 ms.
Server-Side Pagination
The performance issue we see in the previous section can actually be resolved by server-side paging. In this section, we will create a new stored procedure, dbo.GetCustomers_2
, with the following code snippets to calculate page number for each record in SQL Server and the return that to Reporting Services.
Because of the nature of hierarchical data, it makes sense to group all descendants on the same page as their ancestors. To that end, we need to know top-level ancestors’ page number first:
-- :::::::::::::::::::::::::::::::::::::::::::::::::::::: -- Prestage the returning dataset -- :::::::::::::::::::::::::::::::::::::::::::::::::::::: SELECT CustomerID, CustomerName, CASE WHEN ParentCustomerID = -1 THEN NULL ELSE ParentCustomerID END AS ParentCustomerID, Node, NULL AS Page INTO #R FROM Customer WHERE HierarchyLevel > 0 -- :::::::::::::::::::::::::::::::::::::::::::::::::::::: -- Set page number for top-level nodes -- :::::::::::::::::::::::::::::::::::::::::::::::::::::: ;WITH I AS ( SELECT CustomerID, (ROW_NUMBER() OVER (ORDER BY CustomerName))/100 + 1 AS Page FROM #R WHERE ParentCustomerID IS NULL ) UPDATE #R SET Page = I.Page FROM #R INNER JOIN I ON #R.CustomerID = I.CustomerID
For simplicity, we hardcoded page size to be 100 in this example, but the value can be parameterized in a real world application. After the top-level ancestors’ page numbers are determined, we can now propagate the page numbers to their descendants. To keep this process clear, we will first set aside the top-level ancestors' page numbers:
SELECT CustomerID AS TopAncestorId, Node AS TopAncestorNode, Page INTO #T FROM #R WHERE Page IS NOT NULL
Then we will use HIERARCHYID
’s GetAncestor()
method to JOIN descendants to their top-level ancestors and set the page numbers:
UPDATE R SET Page = T.Page FROM #R R INNER JOIN #T T ON R.Node.GetAncestor(R.Node.GetLevel() - 1) = TopAncestorNode WHERE R.Page IS NULL
SELECT * FROM #R
Configure Report with Server-Side Pagination
After creating the new stored procedure to calculate page numbers, we will modify our report to add a new row group outside of [RowGroup_CustomerID]
for pagination. First, swap out the stored procedure for the dataset to use dbo.GetCustomers_2
and refresh fields. Then right-click on [RowGroup_CustomerID], hover Add Group, then click on Parent Group. In the Group by dropdown, select Page and then click on OK. Right click on the newly added row group [Page]
and click on Group Properties. Change the default group name Page to RowGroup_Page. Under the Page Breaks tab, check Between each instance of a group. Click on OK to exit out the Group Properties panel. You will notice that there is a new column added to the tablix for [RowGroup_Page]
. Since we only need the group for paging not for display, right click on the column, click on Delete Columns, and choose Delete columns only. Run the report again and you will notice that the dataset is nicely grouped into 70 pages. In our testing environment, the average refresh time for clicking on the visibility toggle is reduced to around 5 ms.
Conclusion
This article presents an approach to perform server-side pagination for displaying hierarchal data in Reporting Services. By applying the technique, the report performance can be greatly improved. In our testing environment, for example, the server-side pagination results in 98% improvement in efficiency. We hope this article can benefit developers who find the need to display complex hierarchical data in Reporting Services.