Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 

Common table expressions and circular references

By Geoff Norman, (first published: 2011/01/31)

I was recently asked to do a bit of hierarchical analysis on Active Directory groups. No problem, I thought. I can extract group memberships into a table, then chuck the data through a CTE to give the hierarchy. In fact, it all went swimmingly until I hit a very basic problem; my CTE was failing when it hit the maxrecursion value. This indicated strongly that I had one or more circular references - groups that were members of themselves either directly or through inheritance.

Whilst defining the problem was easy, finding how to weed out the culprits was less so. Google seemed to give remarkably few results on how to actually find a circular reference. In fact, the solution is very straightforward. Hopefully my quick and dirty answer will be of some use.

I'll start off with a simple table containing two columns; child and parent.

CREATE TABLE ChildrenAndParents (Child [varchar](255) NOT NULL,
 Parent [varchar](255) NOT NULL)

INSERT INTO ChildrenAndParents (Child, Parent) values	('a', 'b')
INSERT INTO ChildrenAndParents (Child, Parent) values	('b', 'c')
INSERT INTO ChildrenAndParents (Child, Parent) values	('c', 'd')
INSERT INTO ChildrenAndParents (Child, Parent) values	('e', 'b')

The CTE I want to use to show the hierarchy of memberships in the table is one containing four columns; child, parent, level and hierarchypath. The definition is below:

WITH GroupMembers (Child, ParentGroup, Level, hierarchypath)
AS
(
-- Anchor member definition
SELECT g.child, g.parent, 0 AS Level, convert(varchar(max), g.child + '->' + g.parent) AS hierarchypath
FROM childrenandparents AS g
WHERE child not in (select parent from childrenandparents)
UNION ALL
-- Recursive member definition
SELECT g.child, g.parent, Level + 1, hierarchypath + '->' + g.parent
FROM childrenandparents as g
INNER JOIN GroupMembers AS gm
ON gm.parentgroup = g.child
)

select left(hierarchypath, charindex('->', hierarchypath) - 1) as child, parentgroup, level, hierarchypath
from groupmembers
option(maxrecursion 5);

This gives the result below. As you can see, it shows that, for instance, a is a member of d as a result of the path a->b->c->d.

child      parentgroup level       hierarchypath
---------- ----------- ----------- --------------------
a          b           0           a->b
e          b           0           e->b
e          c           1           e->b->c
e          d           2           e->b->c->d
a          c           1           a->b->c
a          d           2           a->b->c->d

Now, I'll put in a circular reference.

INSERT INTO ChildrenAndParents (Child, Parent) values	('d', 'b')

Of course, rerunning the CTE in this form fails, complaining with the error message

Msg 530, Level 16, State 1, Line 1
The statement terminated. The maximum recursion 5 has been exhausted before statement completion
.

This is because b is a child of c, c is a child of d and d is a child of b, hence creating a hierarchy that would go on infinitely. Of course, in reality this is the point at which I encountered the problem - having a circular reference within several thousand rows and needing to find it.

The first step I took was to find all the hierarchy unaffected by the circular references. In order to do this, all I need to do is add a "where" clause to the recursive bit of the CTE so that it excludes any records where the parent is already in the hierarchical path.

WITH GroupMembers (Child, ParentGroup, Level, hierarchypath)
AS
(
-- Anchor member definition
SELECT g.child, g.parent, 0 AS Level, convert(varchar(max), g.child + '->' + g.parent) AS hierarchypath
FROM childrenandparents AS g
WHERE child not in (select parent from childrenandparents)
UNION ALL
-- Recursive member definition
SELECT g.child, g.parent, Level + 1, hierarchypath + '->' + g.parent
FROM childrenandparents as g
INNER JOIN GroupMembers AS gm
ON gm.parentgroup = g.child

--Exclude if the hierarchypath text contains the 'parent' value with a '->' before and after it
where gm.hierarchypath not like '%->' + g.parent + '->%'
)

select left(hierarchypath, charindex('->', hierarchypath) - 1) as child, parentgroup, level, hierarchypath
from groupmembers
order by level desc
option(maxrecursion 5);

I can now rerun the CTE and order the results by Level descending. This gives the following results.

child      parentgroup level       hierarchypath
---------- ----------- ----------- --------------------
e          d           2           e->b->c->d
a          d           2           a->b->c->d
e          c           1           e->b->c
a          c           1           a->b->c
a          b           0           a->b
e          b           0           e->b

Now, the whole problem with circular references is that it doesn't matter how many levels you go down, they'll still keep adding to the hierarchy path. The results above tell us that the "legitimate" data only goes down to level value 2, so anything that uses up more levels than that is data containing a circular reference.

In the CTE definition earlier, you'll see I set a maxrecursion level of 5 (i.e. it'll query recursively five times before giving up and erroring). Therefore I can now change the CTE's definition a bit to make use of this. Instead of excluding records where the parent has already appeared in the hierarchy path, I now leave them in. However, instead, I use a different "where" clause to stop the recursion above a set value. The value I choose is greater than the maximum "legitimate" level I discovered above, but below the maxrecursion value.

WITH GroupMembers (Child, ParentGroup, Level, hierarchypath)
AS
(
-- Anchor member definition
SELECT g.child, g.parent, 0 AS Level, convert(varchar(max), g.child + '->' + g.parent) AS hierarchypath
FROM childrenandparents AS g
WHERE child not in (select parent from childrenandparents)
UNION ALL
-- Recursive member definition
SELECT g.child, g.parent, Level + 1, hierarchypath + '->' + g.parent
FROM childrenandparents as g
INNER JOIN GroupMembers AS gm
ON gm.parentgroup = g.child

where level < 5
)

select left(hierarchypath, charindex('->', hierarchypath) - 1) as child, parentgroup, level, hierarchypath
from groupmembers
order by level desc
option(maxrecursion 5);

This gives the following results.

child      parentgroup level       hierarchypath
---------- ----------- ----------- --------------------
e          d           5           e->b->c->d->b->c->d
a          d           5           a->b->c->d->b->c->d
a          c           4           a->b->c->d->b->c
e          c           4           e->b->c->d->b->c
e          b           3           e->b->c->d->b
a          b           3           a->b->c->d->b
a          d           2           a->b->c->d
e          d           2           e->b->c->d
e          c           1           e->b->c
a          c           1           a->b->c
a          b           0           a->b
e          b           0           e->b

Remember that the "legitimate" data only goes up to level 2, so the top rows in this result are the ones containing the circular references. I can simply examine the string in the hierarchypath column to see where the circular reference is occurring, enabling me to see which records in my source data need to be corrected. In this case, it shows clearly the pattern of b -> c -> d occurring twice, so it must be the d -> b child/parent relationship causing the problem. Looking at the contents of the ChildrenAndParents table confirms there is a record with d as the child and b as the parent, so I've now found my problem record.

Total article views: 16025 | Views in the last 30 days: 26
 
Related Articles
FORUM

Child and parent Relation

Child and parent Relation

FORUM

FOR XML EXPLICIT - One to many relationship at level 3 (3 = parent, 4 = child)

Having trouble rendering a parent/child relationship at level 3/4 respectively.

FORUM

Parent Child Calculations

Calculating a metric on the parent child hierarchy.

FORUM

Parent-child dimension alternative

I would like to have a parent-child hierarchy alternative

ARTICLE

Building Parent-Child Table Tree information

How many times have you wanted to know which child or grandchild records exists for a parent record?...

Tags
cte    
recursion    
t-sql    
 
Contribute

Join the most active online SQL Server Community

SQL knowledge, delivered daily, free:

Email address:  

You make SSC a better place

As a member of SQLServerCentral, you get free access to loads of fresh content: thousands of articles and SQL scripts, a library of free eBooks, a weekly database news roundup, a great Q & A platform… And it’s our huge, buzzing community of SQL Server Professionals that makes it such a success.

Join us!

Steve Jones
Editor, SQLServerCentral.com

Already a member? Jump in:

Email address:   Password:   Remember me: Forgotten your password?
Steve Jones