Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase

Finding hierarchy errors Expand / Collapse
Author
Message
Posted Tuesday, February 18, 2014 7:22 AM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Today @ 4:27 AM
Points: 47, Visits: 357
Hi all,
I've read some posts by Jeff Moden and Gsquared on Hierarchy structures (very helpfull, thank you) an delving into TallyTables on the way. Very enlightning.
What I can't find, is a way to check for user input errors on these structures.
Consider the following data structure:

create table dbo.eqmas(eq_nmr VarChar(20), eq_parent VarChar(20), eq_desc VarChar(80))
INSERT INTO dbo.eqmas
SELECT 'OB01', Null, 'Object 1'
UNION ALL
SELECT 'OB02', 'OB1', 'Object 2'
UNION ALL
SELECT 'OB03', 'OB1', 'Object 1'
UNION ALL
SELECT 'OB04', 'OB2', 'Object 4'
UNION ALL
SELECT 'OB05', 'OB2', 'Object 5'
UNION ALL
SELECT 'OB06', 'OB10', 'Object 6'
UNION ALL
SELECT 'OB07', 'OB8', 'Object 8'
UNION ALL
SELECT 'OB08', 'OB9', 'Object 7'
UNION ALL
SELECT 'OB09', 'OB7', 'Object 7'

As you can see, all is hunkydor up to OB6, this has a non existing parent. This one is easily find with a left outer self join.
OB7 is child of OB8, is child of OB9 is child of OB7. This is a circular reference. This can be one to many deep. One level (self recurring) is easily identified with a eq_nmr = eq_parent select.
Currently I'm traversing the whole tree with a cursor for each record in the table. As the table grows, so goes the performance because of the RBAR (brrrr ).
When I use the following code on a broken production set, I get 'Maximum recursion level reached'
;with HierarchyCTE (Lvl, eq_nmr, eq_parent, eq_desc) as
(select 1, eq_nmr, eq_parent, eq_desc
from dbo.eqmas
union all
select Lvl + 1, h2.eq_nmr, h2.eq_parent, h2.eq_desc
from dbo.eqmas h2
inner join HierarchyCTE
on h2.eq_parent = HierarchyCTE.eq_nmr)
select Lvl, eq_nmr, eq_parent, eq_desc
from HierarchyCTE
order by eq_nmr

Can you suggest a better, faster, more efficient way to identify these errors, or is the cursor way the least bad way to do this.

I'm on SQL2005 Enterprise

Thanks,

Cees Cappelle

p.s.
Here's my current errorchecking code. Feel free to bash.

IF EXISTS (SELECT * FROM dbo.sysobjects WHERE id = object_id(N'mcmain.mcmsp_eqmascheck') AND OBJECTPROPERTY(id, N'IsProcedure') = 1)
DROP PROCEDURE mcmain.mcmsp_eqmascheck
GO

CREATE PROCEDURE mcmain.mcmsp_eqmascheck
(@plannummer VarChar(20) = '', @Version VarChar(5) = '' Output, @debug Bit = 0 ) --WITH RECOMPILE

/* 31-03-2005 CPE
1 mogelijke variabele @plannummer
Als deze niet wordt meegegeven, komt er geen fout, maar wordt de variabele leeg (NotNull)
-- Controle op fouten in eqmas
-- 1. Check op onbestaande parents
-- 2. Check op foute structuur (cycle)
18-03-2013 bhr-ceca v2.0 Uitgebreid met opzoeken systrace oorzaken
*/

AS
EXEC mcmain.mcmsp_translog @transnummer = @plannummer, @Omschrijving = 'Initialize Gestart mcmsp_eqmascheck', @mcm_user = '', @Tabel = 'eqmas', @Aktie = '', @Keyval = ''

-- Initialiseren
SET NOCOUNT ON -- Snelere verwerking zonder (23 row(s) affected)

-- Aanmaken interne variabelen
DECLARE @StorProc VarChar(254)
SET @StorProc = 'mcmsp_eqmascheck'
SET @Version = '2.0'

DECLARE @mcm_user Varchar(20)
DECLARE @mcm_craft VarChar(20)
DECLARE @mcm_group VarChar(20)
DECLARE @mcm_name VarChar(45)
DECLARE @mcm_site VarChar(20)

DECLARE @StrRowId Int

DECLARE @SysError TABLE(strObject VarChar(20), strUser VarChar(20), strAction VarChar(20), strDate DateTime, strSource VarChar(20), strRowId Int)

SET @mcm_user = ''
SET @mcm_craft = ''
SET @mcm_group = ''
SET @mcm_name = ''
SET @mcm_site = ''

EXEC mcmain.mcmsp_translog @transnummer = @plannummer, @Omschrijving = 'Aanmaken eqmaserror tabel', @mcm_user = @mcm_user, @Tabel = 'eqmaserror', @Aktie = 'CREATE', @Keyval = '', @StorProc = @StorProc
If @Debug = 0
BEGIN
IF EXISTS (SELECT * FROM dbo.sysobjects WHERE id = object_id(N'[mcmain].[eqmaserror]') AND OBJECTPROPERTY(id, N'IsUserTable') = 1)
DROP TABLE [mcmain].[eqmaserror]
CREATE TABLE [mcmain].[eqmaserror] (
[Object] VarChar(20), --[char] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
[ObjectOmschrijving] VarChar(45), -- [char] (45) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[OnderdeelVan] VarChar(20), --[char] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[Fout] VarChar(45), -- COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[FoutCode] VarChar(10), -- COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[ControleDatum] DateTime ,
[Sitecode] VarChar(20),
[Usercode] VarChar(32),
[UserAction] VarChar(45),
[UserDate] DateTime,
[UserSource] VarChar(45),
[UserSourceRowId] INT,
[CreateUsercode] VarChar(32),
[CreateUserDate] DateTime,
[er_rowid] [int] IDENTITY (1, 1) NOT NULL
) ON [PRIMARY]
END

EXEC mcmain.mcmsp_translog @transnummer = @plannummer, @Omschrijving = 'Opzoeken onbekende parents', @mcm_user = @mcm_user, @Tabel = 'eqmaserror', @Aktie = 'INSERT', @Keyval = '', @StorProc = @StorProc
IF @Debug = 0
BEGIN
INSERT INTO mcmain.eqmaserror(Object, ObjectOmschrijving, OnderdeelVan, Fout, ControleDatum, FoutCode, Sitecode)
SELECT eqmas.eq_nmr, eqmas.eq_desc,eqmas.eq_parent, 'Onbekende parent', GetDate(), '1', eqmas.eq_site
FROM mcmain.eqmas eqmas
LEFT OUTER JOIN mcmain.eqmas eqparent ON eqparent.eq_nmr = eqmas.eq_parent
WHERE eqparent.eq_nmr IS NULL and eqmas.eq_parent > ''
END

EXEC mcmain.mcmsp_translog @transnummer = @plannummer, @Omschrijving = 'Opzoeken cyclic parents (A->B->C->A)', @mcm_user = @mcm_user, @Tabel = 'eqmaserror', @Aktie = 'INSERT', @Keyval = '', @StorProc = @StorProc
IF @Debug = 0
BEGIN
-- DECLARE @Eqstruct Table(eqparent VarChar(20), eqchild VarChar(20)) -- Memory table om parent child relatie te berekenen
DECLARE DataSET Cursor FOR SELECT eq_nmr, eq_site FROM mcmain.eqmas
DECLARE @Eqparent VarChar(20)
, @Eqchild VarChar(20) -- wijzigt niet
, @EqSite VarChar(20) -- wijzigt niet
, @Eqnummer VarChar(20)
, @EqDesc VarChar(45)
, @Foutcode VarChar(10)
, @Eqlevel Int -- Current level eqmas record
, @EqMaxLevel Int -- Max level = Aantal Records

SET @Eqlevel = 0
SET @EqMaxLevel = (Select COUNT(eq_nmr) from mcmain.eqmas)

OPEN Dataset
FETCH NEXT FROM DataSET INTO @Eqnummer, @EqSite
SET @Eqchild = @Eqnummer
SELECT @EqParent = eq_parent, @EqDesc = eq_desc FROM mcmain.eqmas
WHERE eq_nmr = @EqNummer
--SET @Eqparent = (Select eq_parent from mcmain.eqmas where eq_nmr = @Eqnummer)
--SET @EqDesc = (Select eq_desc from mcmain.eqmas where eq_nmr = @Eqnummer)
WHILE (@@FETCH_STATUS = 0)
BEGIN
SET @Eqlevel = 0
While Rtrim(@Eqparent) > ''
Begin
-- INSERT INTO @eqstruct(eqparent, eqchild)
-- Values(@eqparent, @eqchild)
SET @EqNummer = @Eqparent
SELECT @Eqparent = eq_parent FROM mcmain.eqmas WHERE eq_nmr = @Eqnummer
SET @Eqlevel = @EqLevel + 1
IF @EqLevel >= @EqMaxLevel
BEGIN
INSERT INTO mcmain.eqmaserror(Object, ObjectOmschrijving, OnderdeelVan, Fout, ControleDatum, FoutCode, Sitecode)
Values(@Eqchild,@EqDesc,@EqParent,'Cyclic Parent gevonden', GetDate(), '2', @EqSite)
-- PRINT 'Fout '+@EqParent+' '+@Eqchild+' '+@Eqnummer
SET @EqParent = ''
END
End
FETCH NEXT FROM DataSET INTO @Eqnummer, @EqSite
SELECT @EqDesc = eq_desc FROM mcmain.eqmas WHERE eq_nmr = @Eqnummer
SET @Eqchild = @Eqnummer
SET @Eqparent = @Eqnummer
END

Close Dataset
Deallocate Dataset

END

-- Als er geen fouten gevonden zijn, kan de eqmaserror tabel weer weg.
If @Debug = 0
BEGIN
IF NOT EXISTS (SELECT * FROM mcmain.eqmaserror)
BEGIN
EXEC mcmain.mcmsp_translog @transnummer = @plannummer, @Omschrijving = 'Verwijderen lege eqmaserror', @mcm_user = @mcm_user, @Tabel = 'eqmaserror', @Aktie = 'INSERT', @Keyval = '', @StorProc = @StorProc
DROP TABLE [mcmain].[eqmaserror]
END
ELSE
BEGIN -- Opzoeken geschatte redenen (18-03-2013)
DECLARE DataSET Cursor FOR SELECT [OnderdeelVan], [SiteCode], [FoutCode] FROM mcmain.eqmaserror --WHERE [FoutCode] = '1'
OPEN Dataset
FETCH NEXT FROM Dataset INTO @EqParent, @EqSite, @FoutCode
WHILE (@@FETCH_STATUS = 0)
BEGIN
IF @Foutcode = '1'
BEGIN -- Opzoeken gedelete records
INSERT INTO @SysError(strObject, strUser, strAction, strDate, strSource, strRowId)
SELECT TOP 1 str_keyval, str_usr, str_type, str_date, 'Systrace', str_rowid
FROM mcmain.systrace
WHERE str_keyval = @EqParent AND str_type = 'UDL'
ORDER BY str_date
--SELECT @@RowCount
IF @@RowCount = 0
BEGIN
INSERT INTO @SysError(strObject, strUser, strAction, strDate, strSource, strRowId)
SELECT TOP 1 str_keyval, str_usr, str_type, str_date, 'Systrace', str_rowid
FROM McSystem.mcmain.systrace
WHERE str_keyval = @EqParent AND str_site = @EqSite AND str_type = 'UDL'
ORDER BY str_date
END
END

IF @Foutcode = '2'
BEGIN
INSERT INTO @SysError(strObject, strUser, strAction, strDate, strSource, strRowId)
SELECT TOP 1 str_keyval, str_usr, str_type, str_date, 'Systrace', str_rowid
FROM mcmain.systrace WHERE str_keyval = @EqParent and str_site = @EqSite
and str_desc like '%UPDATE%' and str_desc like '%eq_parent%'
order by str_date
END
IF @@RowCount = 0
BEGIN
INSERT INTO @SysError(strObject, strUser, strAction, strDate, strSource, strRowId)
SELECT TOP 1 str_keyval, str_usr, str_type, str_date, 'McSystem', str_rowid
FROM McSystem.mcmain.systrace WHERE str_keyval = @EqParent and str_site = @EqSite
and str_desc like '%UPDATE%' and str_desc like '%eq_parent%'
order by str_date
--select @EqParent, @EqSite, @@RowCount
END

UPDATE mcmain.eqmaserror
SET UserCode = strUser
, UserAction = strAction
, UserDate = strDate
, UserSource = strSource
, UserSourceRowId = strRowId
FROM mcmain.eqmaserror
INNER JOIN @SysError ON strObject = [OnderdeelVan] AND FoutCode = @FoutCode

FETCH NEXT FROM Dataset INTO @EqParent, @EqSite, @FoutCode
DELETE @SysError
END
Close Dataset
Deallocate Dataset

--SELECT * FROM @SysError

UPDATE mcmain.eqmaserror
SET UserCode = eq_chuser
-- , UserAction = strAction
, UserDate = eq_chdate
, UserSource = 'eqmas'
, UserSourceRowId = eq_rowid
FROM mcmain.eqmaserror
INNER JOIN mcmain.eqmas ON eq_nmr = [OnderdeelVan] AND eq_chdate > '' and Userdate is Null

UPDATE mcmain.eqmaserror
SET CreateUserCode = eq_chuser
, CreateUserDate = eq_chdate
FROM mcmain.eqmaserror
INNER JOIN mcmain.eqmas ON eq_nmr = Object

EXEC mcmain.mcmsp_translog @transnummer = @plannummer, @Omschrijving = 'Tonen foute records', @mcm_user = @mcm_user, @Tabel = 'eqmaserror', @Aktie = 'SELECT', @Keyval = '', @StorProc = @StorProc
SELECT * FROM [mcmain].[eqmaserror]
END
END


-- Finaliseren
EXEC mcmain.mcmsp_translog @transnummer = @plannummer, @Omschrijving = 'Finaliseren', @mcm_user = @mcm_user, @Tabel = '', @Aktie = '', @Keyval = '', @StorProc = @StorProc


EXEC mcmain.mcmsp_translog @transnummer = @plannummer, @Omschrijving = 'Eqmas controle gereed', @mcm_user = @mcm_user, @Tabel = '', @Aktie = '', @Keyval = '', @StorProc = @StorProc
GO


Post #1542562
Posted Tuesday, February 18, 2014 8:12 AM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 9:48 AM
Points: 35,950, Visits: 30,232
Once you have all of the problems resolved with the "non-existing parent" problem, you should add a self-referencing FK that contrains the values of parentIDs to only those values available as child IDs.

On the "circular reference problem", I'm getting ready for work and can't demo the code just now. If someone doesn't get to it (would be done during the recurrsive CTE to build the sort path) before I do later tonight, I'll see if I can explain it with some code.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

"Change is inevitable. Change for the better is not." -- 04 August 2013
(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1542598
Posted Wednesday, February 19, 2014 5:15 AM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Today @ 4:27 AM
Points: 47, Visits: 357
I'll look into the FK thingie. Would be nice if you could point me in the right direction to modify my sp to CTE instead of cursors.
Post #1542931
Posted Wednesday, February 19, 2014 8:53 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 9:48 AM
Points: 35,950, Visits: 30,232
Cees Cappelle-442904 (2/19/2014)
I'll look into the FK thingie. Would be nice if you could point me in the right direction to modify my sp to CTE instead of cursors.


Apologies for the delays. I've had a pretty full dance card for the last couple of days but I have been working on some code for you. It's actually simple code but I test the hell out of these things before I publish them.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

"Change is inevitable. Change for the better is not." -- 04 August 2013
(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1543315
Posted Thursday, February 20, 2014 8:38 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 9:48 AM
Points: 35,950, Visits: 30,232
Ok... Here's your table with some modified data to make a couple of points.

--  DROP TABLE dbo.eqmas
go
CREATE TABLE dbo.eqmas
(eq_nmr VARCHAR(20), eq_parent VARCHAR(20), eq_desc VARCHAR(80))
;
INSERT INTO dbo.eqmas
SELECT 'OB01', Null , 'Object 1' UNION ALL
SELECT 'OB02', 'OB01', 'Object 2' UNION ALL --Circular
SELECT 'OB03', 'OB01', 'Object 3' UNION ALL
SELECT 'OB04', 'OB02', 'Object 4' UNION ALL
SELECT 'OB05', 'OB02', 'Object 5' UNION ALL --Circular
SELECT 'OB11', 'OB05', 'Object 11' UNION ALL --Circular
SELECT 'OB02', 'OB11', 'Object 2' UNION ALL --Circular -- This is the cause
SELECT 'OB06', 'OB10', 'Object 6' UNION ALL --Stranger (is not a child)
SELECT 'OB07', 'OB08', 'Object 7' UNION ALL --Orphan (Island)
SELECT 'OB08', 'OB09', 'Object 8' UNION ALL --Orphan (Island)
SELECT 'OB09', 'OB07', 'Object 9' --Orphan (Island)
;


I suppose someone could do this all in one query but I wanted to keep the hierarcy "builder" separate from the code that does the troubleshooting. If any rows show up in the trouble shooting section of the code, then you need to make a fix in the data.

/*=============================================================================
This will find the "orphans" including the "tail" of any circular references.
Note that an orphaned group of nodes that contains a circular reference
usually won't cause a "runaway".
=============================================================================*/
--===== This part builds the hierarchy with as much data as it can stopping
-- short of "runaway" circular references.
IF OBJECT_ID('tempdb..#Found','U') IS NOT NULL
DROP TABLE #Found
;
WITH
cteDirectReports AS
(
SELECT eq_nmr, eq_parent, Lvl = 1,
HierarchicalPath = CAST('\'+eq_nmr+'\' AS VARCHAR(MAX))
FROM dbo.eqmas
WHERE eq_parent IS NULL
UNION ALL
SELECT t.eq_nmr, t.eq_parent, Lvl = c.Lvl + 1,
HierarchicalPath = CAST(c.HierarchicalPath+t.eq_nmr+'\' AS VARCHAR(MAX))
FROM dbo.eqmas t
INNER JOIN cteDirectReports c ON t.eq_parent = c.eq_nmr
WHERE c.HierarchicalPath NOT LIKE '%\'+ t.eq_nmr+'\%'
)
SELECT NodeNumber = ROW_NUMBER()OVER(ORDER BY HierarchicalPath)
,eq_nmr, eq_parent, Lvl, HierarchicalPath
INTO #Found
FROM cteDirectReports
ORDER BY HierarchicalPath
;
--===== Display the sorted Adjacency List of what "fit" in the hierarchy
SELECT *
FROM #Found
ORDER BY NodeNumber
;
--===== Find all the problems with the Adjacency List
WITH
cteOrphans AS
( --=== This "subtracts" the nodes that were found above from all the nodes
-- available to return "orphans".
SELECT eq_nmr, eq_parent FROM dbo.eqmas
EXCEPT
SELECT eq_nmr, eq_parent FROM #Found
)
SELECT o.*, f.*
,Problem = CASE
WHEN f.HierarchicalPath LIKE '%\'+o.eq_nmr+'\%' THEN 'Circular Cause'
WHEN s.eq_nmr IS NULL THEN 'Stranger'
WHEN f.eq_nmr IS NULL THEN 'Orphan'
ELSE '???????'
END
FROM cteOrphans o
LEFT JOIN #Found f ON o.eq_parent = f.eq_nmr
LEFT JOIN dbo.eqmas s ON o.eq_parent = s.eq_nmr
ORDER BY o.eq_nmr
;

Here's the output of the #Found table.

NodeNumber eq_nmr eq_parent Lvl HierarchicalPath
---------- ------ --------- --- ---------------------
1 OB01 NULL 1 \OB01\
2 OB02 OB01 2 \OB01\OB02\
3 OB04 OB02 3 \OB01\OB02\OB04\
4 OB05 OB02 3 \OB01\OB02\OB05\
5 OB11 OB05 4 \OB01\OB02\OB05\OB11\
6 OB03 OB01 2 \OB01\OB03\

(6 row(s) affected)



And, here's the output of the "problem finder" code. I'll let you rename the duplicate column names as you see fit.

eq_nmr eq_parent NodeNumber eq_nmr eq_parent Lvl  HierarchicalPath      Problem       
------ --------- ---------- ------ --------- ---- --------------------- --------------
OB02 OB11 5 OB11 OB05 4 \OB01\OB02\OB05\OB11\ Circular Cause
OB06 OB10 NULL NULL NULL NULL NULL Stranger
OB07 OB08 NULL NULL NULL NULL NULL Orphan
OB08 OB09 NULL NULL NULL NULL NULL Orphan
OB09 OB07 NULL NULL NULL NULL NULL Orphan

(5 row(s) affected)



To be honest, I've not tested it for performance but it should do pretty well especially if you have a clustered index in the eq_parent column. Try it with the indexes you have, first.

I'll be back on the FK thing in a couple of minutes.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

"Change is inevitable. Change for the better is not." -- 04 August 2013
(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1543796
Posted Thursday, February 20, 2014 9:05 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 9:48 AM
Points: 35,950, Visits: 30,232
Here's a classic example of an "Employee" organizational chart table. It has the self referencing FK that I'm talking about where the "ManagerID" must be found as an "EmployeeID" except when the "ManagerID" is NULL. Such a thing will prevent "strangers" and the PK on the "EmployeeID" column will prevent most circular references not to mention "cross branching" where an employee has more than 1 position.

--===== Create the test table with a clustered PK and an FK to itself to make
-- sure that a ManagerID is also an EmployeeID.
CREATE TABLE dbo.Employee
(
EmployeeID INT NOT NULL,
ManagerID INT NULL,
EmployeeName VARCHAR(10) NOT NULL,
CONSTRAINT PK_Employee
PRIMARY KEY CLUSTERED (EmployeeID),
CONSTRAINT FK_Employee_Employee
FOREIGN KEY (ManagerID)
REFERENCES dbo.Employee (EmployeeID)
ON UPDATE NO ACTION
ON DELETE NO ACTION
)
;


Of course, none of that will work until you iron out the bugs in the data. Once thats, done, add the PK and the FK I mentioned and a whole lot of problems will be prevented. There are other reasonable checks that you can add like making sure that the child column isn't equal to the parent column for any given row.

Of course, those won't help with "orphans" so you'll still need to do something like I did in the "Problem checking code" in the post above this one.

Shifting gears a bit, the Adjacency List (parent/child) type of hierarchy is comparatively easy for humans to fix because each node is aware of one and only one other node. The "Hierarchical Path" hierarchy that I also formed (and is similar to the HierarchyID datatype, which I never use) has it's uses but is a bit difficult to use for anything practical. Nested Sets are really fast and sortable on the "Left Bower" but, like Hierarchical Paths, are a real PITA to maintain because each node is aware of many other nodes.

With that thought in mind, you can take advantage of the advantages of each. Please see the following article for how that can be done.
http://www.sqlservercentral.com/articles/Hierarchy/94040/

On the other hand, there's normally a pretty finite limit as to what people actually want to get out of hierarchies. The following article explains how to solve for all of those in a kind of new, pre-aggregated hierarchical structure. MLM'ers love this method.
http://www.sqlservercentral.com/articles/T-SQL/94570/



--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

"Change is inevitable. Change for the better is not." -- 04 August 2013
(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1543800
Posted Friday, February 21, 2014 12:34 AM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Today @ 4:27 AM
Points: 47, Visits: 357
Jeff, wonderful post, very helpful.
I'll go study your example code and see what comes up.
Thanks for the reference to hierarchy studies, I'll put it in my reading/todo list.

Tried your code and it works like a charm.
My SP takes 2.5 secs on a small production db, your code 0.085 with 1 instead of two roundtrips.
On my largest one (61616 eqmas records) my full sp 48.2 sec , 2 roundtrips
Your code 7 roundtrips, 33.2 sec

I'll integrate your code in my sp

Cees Cappelle
Post #1543850
Posted Friday, February 21, 2014 2:18 AM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Today @ 4:27 AM
Points: 47, Visits: 357
Sorry to be a bother, but seem to have found an issue with your code.
You state:
WHEN f.HierarchicalPath LIKE '%\'+o.eq_nmr+'\%' THEN 'Circular Cause'

but f references #found with a
  WHERE c.HierarchicalPath NOT LIKE '%\'+ t.eq_nmr+'\%'

Can't get it to work like I expect. I've added a few more lines to the data:
 INSERT INTO dbo.eqmas
SELECT 'OB01', Null , 'Object 1' UNION ALL
SELECT 'OB02', 'OB01', 'Object 2' UNION ALL --Circular
SELECT 'OB03', 'OB01', 'Object 3' UNION ALL
SELECT 'OB04', 'OB02', 'Object 4' UNION ALL
SELECT 'OB05', 'OB02', 'Object 5' UNION ALL --Circular
SELECT 'OB11', 'OB05', 'Object 11' UNION ALL --Circular
SELECT 'OB02', 'OB11', 'Object 2' UNION ALL --Circular -- This is the cause
SELECT 'OB06', 'OB10', 'Object 6' UNION ALL --Stranger (is not a child)
SELECT 'OB07', 'OB08', 'Object 7' UNION ALL --Orphan (Island)
SELECT 'OB08', 'OB09', 'Object 8' UNION ALL --Orphan (Island)
SELECT 'OB09', 'OB07', 'Object 9' UNION ALL --Orphan (Island)
SELECT 'TSTOB222 ','TSTOB21 ','' UNION ALL
SELECT 'TSTOB221 ','TSTOB22 ','' UNION ALL
SELECT 'TSTOB2 ','TSTOB222' ,''
;

This gives me:
eq_nmr	eq_parent	NodeNumber	eq_nmr	eq_parent	Lvl	HierarchicalPath	Problem
OB02 OB11 5 OB11 OB05 4 \OB01\OB02\OB05\OB11\ Circular Cause
OB06 OB10 NULL NULL NULL NULL NULL Stranger
OB07 OB08 NULL NULL NULL NULL NULL Orphan
OB08 OB09 NULL NULL NULL NULL NULL Orphan
OB09 OB07 NULL NULL NULL NULL NULL Orphan
TSTOB2 TSTOB222 NULL NULL NULL NULL NULL Orphan
TSTOB221 TSTOB22 NULL NULL NULL NULL NULL Stranger
TSTOB222 TSTOB21 NULL NULL NULL NULL NULL Stranger

OB2 gives me a Circular, but is goes from OB2-OB1, that's a normal Child-Parent.
TSTOB2 gives me an Orphan, but it's a circular.

Can you help?
Post #1543866
Posted Friday, February 21, 2014 8:09 AM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 9:48 AM
Points: 35,950, Visits: 30,232
I have to get ready for work but let me throw a couple of things your way.

The code found all of the problems and now is a matter of interpretation for the human. To explain...

The "group" of 3 rows that begins with "TS" isn't a circular reference because 2 of the 3 rows contain "strangers" in the parent column. That is, they contain IDs in the parent column that are not contained in the child column. The 3rd one is truly an "orphan" because there's no path back to the root. So,

The "group" of 3 rows that contain OB07, OB08, and OB09 are circular but the code can't get to that group to make that determination because of the larger problem... there's no path back to the root from this group, which is the definition of an "orphan". So, the group actually has 2 problems, the most important of which is that they're "orphans".

OB2 gives me a Circular, but is goes from OB2-OB1, that's a normal Child-Parent.



To be semantically correct, there is no OB2 or OB1. There's an OB02 and an OB01. Hierarchies are confusing enough without leaving characters out of the names. I suspect that typing problem is also the cause of why you thought the "TS" group was circular instead of 2 of them being "strangers".

That, not withstanding, I had comment-marked that row as "circular" because it's part of a circular path and not necessarily a part of the "loop". Ignore that comment. The BIG problem here is the OB02/OB11 row and the code found it. It not only causes a circular reference but it also causes "cross branching" which is why the number of rows associated with it "exploded". The problem with "cross branching" will also cause explosive duplication of rows in the #Found table, as well.

The next step in trying to repair the data would be to determine what the correct parent of all those rows are, fix them, and do another run until no rows appear in the "problems listing". Of course, that would mean getting more information from the people that "designed" this hierarchy or removing the bad rows as "not resolvable".


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

"Change is inevitable. Change for the better is not." -- 04 August 2013
(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1544019
Posted Tuesday, March 04, 2014 4:59 AM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Today @ 4:27 AM
Points: 47, Visits: 357
Thank you, Jeff. I've been away a few days, sorry for the late reply.
Yes, the typo's are my bad. I'll study some more on your explanations.
Post #1547249
« Prev Topic | Next Topic »

Add to briefcase

Permissions Expand / Collapse