SQLServerCentral Article

CTE Performance

,

CTE Performance on Recursive Queries

Hierarchy structures are widely used in data model and SQL Server implementation for real world entities like manageremployee relation, organizational structures, regional structures etc. In order to get all the descendants for a node in the hierarchy, we need to recursively query the children of the node, and the children of the children, and so on until we reach the leaf level. In SQL Server 2000, this is achieved by loops traversing all the levels under the node. SQL 2005 provides the new feature Common Table Expression (CTE), which can be used to solve this request.

Introduction

When CTE is used in recursive query, it consists of at least one anchor member and one recursive member, e.g., in the following sample that gets all the employees under the "Vice President of Production",

  • [1] is the anchor member that queries the record for "Vice President of Production";
  • [2] is the recursive member by referencing the CTE name AllEmps;
  • [3] is the execution of the CTE;
USE AdventureWorks;
GO
WITH AllEmps (ManagerID, EmployeeID, ManagerTitle, EmpTitle, [Level])
AS
(
--[1]
SELECT E.ManagerID, E.EmployeeID, F.Title, E.Title, 0 AS [Level]
FROM HumanResources.Employee AS E,HumanResources.Employee AS F
WHERE E.Title='Vice President of Production' AND E.ManagerID=F.EmployeeID
UNION ALL
--[2]
SELECT E.ManagerID, E.EmployeeID, A.EmpTitle, E.Title, A.Level + 1
FROM HumanResources.Employee AS E
INNER JOIN AllEmps AS A ON E.ManagerID = A.EmployeeID
)
-- [3]
SELECT ManagerID, EmployeeID, ManagerTitle, EmpTitle, [Level]
FROM AllEmps
ORDER BY [Level],ManagerTitle,EmpTitle
GO

Though CTE provides a simple and neat solution for recursive queries in SQL 2005, we need to know its performance implication before we migrate existing code.

Performance Test

Hierarchy structures can be level intensive, i.e. the tree has many levels with fewer children under each node, or sub-node intensive, i.e. the tree has fewer levels with many children under each node. Tests are performed for these two extreme scenarios.

Test Environment

  • Hardware: Desk top PC with P4 1.60GHz CPU, 1GB RAM, 200GB IDE hard drive without software or hardware RAID.
  • Software: SQL server 2005 developer edition with SP2 on Windows 2003 Standard Edition;
  • Table schema and the stored procedures (script):

The script creates one table and stored procedures used by the test. The table schema is as follow:

CREATE TABLE dbo.Groups
 ( GroupID int identity NOT NULL,
 GroupName nvarchar(100),
 ParentGroupID int NULL,
 [Description] nvarchar(150)
 )
ALTER TABLE dbo.Groups ADD CONSTRAINT PK_Groups PRIMARY KEY CLUSTERED (GroupID)
CREATE INDEX IX_Groups_ParentGroupID ON dbo.Groups (ParentGroupID)

The stored procedures used for descendants queries are dbo.GetDescendantsCTE and dbo.GetDescendantsLoop, which query all the descandant groups for a given group by CTE and loop respectively.

The stored procedures used for ancestors queries are dbo.GetAncestorsCTE and dbo.GetAncestorsLoop, which query all the ancestor groups for a given group by CTE and loop respectively.

Test Case Design

Tests to query descendants:

 

Script to populate

data

Roots

Children

Total levels

Total records

Test

1

CTETestPopulateData1.txt

2

2

10

2046

Test

2

CTETestPopulateData2.txt

2

3

10

59048

Test

3

CTETestPopulateData3.txt

2

12

4

3770

For each test case, nodes of different levels in the tree are selected and its descendants are queried by the two stored procedures respectively.

Test to query ancestors:

 

Script to populate

data

Roots

Children

Total levels

Total records

Test

5

CTETestPopulateData2.txt

2

3

10

59048

Test Results

Results are shown in the following table and chart.

Tests

Levels

Rows Returned

ByCTE(ms)

ByLoop(ms)

CTE:Loop

Test 1

10

1023

359

214

1.6776

9

511

317

166

1.9096

6

63

90

63

1.4286

Test 2

9

9841

3743

1601

2.3379

8

3280

1136

703

1.6159

7

1093

467

306

1.5261

6

364

291

203

1.4335

4

40

125

46

2.7174

3

13

<3

<3

1.0000

Test 3

4

1885

2526

255

9.9059

3

157

258

77

3.3506

The results indicate that CTE is much slower than loop. When the number of children increases under each node, the performance of CTE gets worse quickly.

The following images are the actual execution plans of the recursive part for both methods. The plans do not give any clue to their performance difference. However, both plans have expensive key lookup operator due to the index seek operator in the plans.

To investigate whether key lookup on CTE and loop recursive queries has different behavior, I created two new stored procedures (dbo.GetDescendantsCTE2 and dbo.GetDescendantsLoop2) to keep only the key column GroupID of the index and [Level] in the CTE and the table variable. Other output columns are retrieved by join the CTE (table variable) with the table in the select statement, where clustered index seek is used:

SELECT G.GroupID, G.ParentGroupID, G.GroupName, P.GroupName AS ParentGroupName, D.[Level]
FROM Descendants D INNER JOIN dbo.Groups G ON D.GroupID=G.GroupID
LEFT JOIN dbo.Groups P ON G.ParentGroupID=P.GroupID
ORDER BY [Level],ParentGroupID,GroupID

Run CTETestPopulateData2.txt to populate test data, and run the "Test4" in both CTETestByCTE.txt and CTETestByLoop.txt. The following is the new recursive part execution plan for CTE. Both plans for CTE and loop eliminated the key lookup operator.

The results are showed below. The performance of both methods is improved by removing the key lookup.

Levels

Rows Returned

Test#

ByCTE(ms)

ByLoop(ms)

CTE:Loop

9

9841

Test2

3743

1601

2.3379

Test4

2056

1402

1.466

Improvements

82%

14%

 

8

3280

Test2

1136

703

1.6159

Test4

823

547

1.5046

Improvements

38%

28%

 

In the query ancestor test, run CTETestPopulateData2.txt to populate the data, and run the "Test5" in both CTETestByCTE.txt and CTETestByLoop.txt. The result shows that performance of CTE is as good as, if not better than, traditional loop methods. I didnt test trees with more than 10 levels.

Tests

Levels

Rows Returned

ByCTE(ms)

ByLoop(ms)

CTE:Loop

Test 5

10

10

<3

<3

1.0

A test (script not included) to query descendants on a 10 level tree with one child per node shows that the performance of CTE and loop is also comparable.

Conclusion

CTE is a neat way to implement recursive queries. However its performance is generally more than 50% slower than the recursive query by the traditional loop method if recursive part of the query returns more than 1 record for each node like the query for descendants of a node in a tree. When the number of children under a node is increasing, the performance of CTE degrades even more quickly in the mentioned scenario.

If the recursive part returns one record for each node, e.g. queries for ancestors of a node in a tree, the performance of CTE is comparable as loop method.

In the recursive part of a CTE or loop method, if index seek is used in the execution plan, only including columns covered by indexes of the tables involved in the recursive query can improve performance.

Before migrating the recursive queries to CTE in your database, you need to understand the scenarios how data is queried and how heavy the recursive query load is in your system. In a busy OLTP system, 10ms degradation for highly concurrent calls can become bottleneck of the whole server.

Rate

4.67 (18)

You rated this post out of 5. Change rating

Share

Share

Rate

4.67 (18)

You rated this post out of 5. Change rating