Find Orphan records in a table

  • I have a table which follows a hierarchy and I am trying to find the orphan records.

    The table has two columns:

    UnitID ParentUnitID

    631 0

    632 631

    633 632

    634 633

    635 633

    It appears UnitID 631 is deleted and so I am left with orphan records (632,633,634 and 635) which I need to delete. I have many other orphan records like these. Is there an easy way to identify the orphan records using SQL and delete them?

  • You could use a recursive cte for this.

    btw, in your sample data you have no orphans because UnitID 631 is the first one there. 😉

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Hi KS-321165,

    This will tell you which children's parents do not exist

    SELECT

    *

    FROM Table1 t1

    LEFT JOIN tt AS Table1 ON t1.ParentUnitID = t2.UnitID

    This will give you the records where a child does not have a parent:

    SELECT

    *

    FROM Table1 t1

    LEFT JOIN tt AS Table1 ON t1.ParentUnitID = t2.UnitID

    WHERE t2.UnitID IS NULL

    This will delete the records from the table which do not have parents:

    DELETE FROM Table1

    WHERE UnitID IN

    (

    SELECT

    *

    FROM Table1 t1

    LEFT JOIN tt AS Table1 ON t1.ParentUnitID = t2.UnitID

    WHERE t2.UnitID IS NULL

    )

    and here is where I think Sean meant you will have to use Recursive CTE, as you will have to run this last query a few times, because it can happen that a child you delete which does not have a parent may be a parent in itself, so you will have to run the query until no records are returned any longer

    Kind Regards

  • Thanks! I will try it out.

  • Sorry. I was just trying to show the relationship between records. Can you give an example of recursive cte to do this?

  • This is the script to create records in table:

    --===== If the test table already exists, drop it

    IF OBJECT_ID('TempDB..#tblUnit','U') IS NOT NULL

    DROP TABLE #tblUnit

    CREATE TABLE #tblUnit(

    [UnitID] [int] NOT NULL,

    [ParentUnitID] [int] NOT NULL)

    INSERT INTO #tblUnit (UnitID,ParentUnitID) VALUES (632,631);

    INSERT INTO #tblUnit (UnitID,ParentUnitID) VALUES (633,632);

    INSERT INTO #tblUnit (UnitID,ParentUnitID) VALUES (634,633);

    INSERT INTO #tblUnit (UnitID,ParentUnitID) VALUES (635,633);

    select * from #tblUnit

  • Sorry,

    I just saw in my haste I never tested the code I posted, there were a few errors as I was making use of a different table name than the one I was posting, and I was rushing it, my bad :hehe:

    I haven't used a CTE to delete records before, so if someone could post one to match with this example it would be great, but here is my solution, using a while loop, not sure what the performance difference would be between a while loop and a CTE, but it should do what you need, sorry for the previous crappy code, this one I tested:

    WHILE EXISTS(

    SELECT

    t1.UnitID

    FROM Table1 t1

    LEFT JOIN Table1 AS t2 ON t1.ParentUnitID = t2.UnitID

    WHERE t2.UnitID IS NULL)

    BEGIN

    DELETE

    FROM Table1

    WHERE UnitID IN

    (

    SELECT

    t1.UnitID

    FROM Table1 t1

    LEFT JOIN Table1 AS t2 ON t1.ParentUnitID = t2.UnitID

    WHERE t2.UnitID IS NULL

    )

    END

    and it will just run until there are none left

    ciao

  • Thanks!

    This does not work the way I want to.

    Also, I have an additional issue to consider and that is: All ParentIDs begin at 0 so they will not have a parent record of their own. So if I add two records for UnitID 1 (one as UnitID and the other as parent), these should not be deleted.

    My revised script is as follows:

    --===== If the test table already exists, drop it

    IF OBJECT_ID('TempDB..#tblUnit','U') IS NOT NULL

    DROP TABLE #tblUnit

    CREATE TABLE #tblUnit(

    [UnitID] [int] NOT NULL,

    [ParentUnitID] [int] NOT NULL)

    INSERT INTO #tblUnit (UnitID,ParentUnitID) VALUES (1,0);

    INSERT INTO #tblUnit (UnitID,ParentUnitID) VALUES (15,1);

    INSERT INTO #tblUnit (UnitID,ParentUnitID) VALUES (632,631);

    INSERT INTO #tblUnit (UnitID,ParentUnitID) VALUES (633,632);

    INSERT INTO #tblUnit (UnitID,ParentUnitID) VALUES (634,633);

    INSERT INTO #tblUnit (UnitID,ParentUnitID) VALUES (635,633);

    --select * from #tblUnit

    WHILE EXISTS(

    SELECT

    t1.UnitID

    FROM #tblUnit t1

    LEFT JOIN #tblUnit AS t2 ON t1.ParentUnitID = t2.UnitID

    WHERE t2.UnitID IS NULL)

    BEGIN

    select *

    FROM #tblUnit

    WHERE UnitID IN

    (

    SELECT

    t1.UnitID

    FROM #tblUnit t1

    LEFT JOIN #tblUnit AS t2 ON t1.ParentUnitID = t2.UnitID

    WHERE t2.UnitID IS NULL

    )

    END

    select * from #tblUnit

  • Sorry, the select in the middle of code should be replaced by DELETE as follows:

    --===== If the test table already exists, drop it

    IF OBJECT_ID('TempDB..#tblUnit','U') IS NOT NULL

    DROP TABLE #tblUnit

    CREATE TABLE #tblUnit(

    [UnitID] [int] NOT NULL,

    [ParentUnitID] [int] NOT NULL)

    INSERT INTO #tblUnit (UnitID,ParentUnitID) VALUES (1,0);

    INSERT INTO #tblUnit (UnitID,ParentUnitID) VALUES (15,1);

    INSERT INTO #tblUnit (UnitID,ParentUnitID) VALUES (632,631);

    INSERT INTO #tblUnit (UnitID,ParentUnitID) VALUES (633,632);

    INSERT INTO #tblUnit (UnitID,ParentUnitID) VALUES (634,633);

    INSERT INTO #tblUnit (UnitID,ParentUnitID) VALUES (635,633);

    --select * from #tblUnit

    WHILE EXISTS(

    SELECT

    t1.UnitID

    FROM #tblUnit t1

    LEFT JOIN #tblUnit AS t2 ON t1.ParentUnitID = t2.UnitID

    WHERE t2.UnitID IS NULL)

    BEGIN

    DELETE

    FROM #tblUnit

    WHERE UnitID IN

    (

    SELECT

    t1.UnitID

    FROM #tblUnit t1

    LEFT JOIN #tblUnit AS t2 ON t1.ParentUnitID = t2.UnitID

    WHERE t2.UnitID IS NULL

    )

    END

    select * from #tblUnit

  • Right,

    You will have to be a bit clearer about what you want to do, do you want to delete orphan records as in your original post or do you want to create a table which will show all the records with a hierarchy for the relationships? It seems like your question has morphed from the one to the other,

    The code with the While loop will delete orphan records, if your question is around the hierarchy view, how do you want to see the records?

    Thanks

  • Here is an example using a recursive CTE to get the list of IDs to delete. Please note that to your original data I removed 631. That left the sample table with only orphans and not in a very realistic state. To deal with that I added UnitID 1 and 3.

    if OBJECT_ID('tempdb..#Something') is not null

    drop table #Something

    create table #Something

    (

    UnitID int,

    ParentUnitID int

    )

    insert #Something

    select 1, 0 union all --this one is a "Root" level with no parents

    select 3, 1 union all

    select 632, 631 union all

    select 633, 632 union all

    select 634, 633 union all

    select 635, 633;

    with MyCTE as

    (

    select UnitID

    from #Something s

    where not exists (select UnitID from #Something s2 where s2.UnitID = s.ParentUnitID)

    and ParentUnitID > 0

    UNION ALL

    select s.UnitID

    from #Something s

    inner join MyCTE on MyCTE.UnitID = s.ParentUnitID

    )

    --if you just want to see what will be deleted just comment out the delete and

    --select * from MyCTE

    delete #Something

    from #Something s

    join MyCTE on s.UnitID = MyCTE.UnitID

    select * from #Something

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Thanks! I will check it out.:-)

  • Yes, the objective is to delete orphan records but in the example UnitID 1's record should not be deleted as their heirarchy is correct. You could have unit id 1 with ParentUnitID 0 and unitid 17 with parentUnitID 1.

  • Thanks Sean. It is just what is required.

    One more request:

    I am new to CTE.

    Do you have a CTE that will delete all the children when the parent is deleted automatically? So we don't get into this situation again?:-)

  • KS-321165 (5/9/2013)


    Thanks Sean. It is just what is required.

    One more request:

    I am new to CTE.

    Do you have a CTE that will delete all the children when the parent is deleted automatically? So we don't get into this situation again?:-)

    You could set it up a cascading FK. Then you don't need to run a CTE.

    You can't have a cascading delete on a self referencing foreign key.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

Viewing 15 posts - 1 through 15 (of 21 total)

You must be logged in to reply to this topic. Login to reply