The union in ancher members in cte is done before the execution of recursive mem

  • hi,

    1) in following query i have changed the second ancher member which was getting father to mother , thus i have created duplicate record, so i want to know that union of ancher memeber is done first and then recursive member is executed or after?

    2) similarly if i have only one acher memeber lests say mother and one recursive member lets say mother(grandmother) , then

    i put distinct in ancher member because due to some reason i have duplicate records there , so will the distinct be applied before the excution of recursive member or after

    any link would be helpfull.

    Genealogy table

    IF OBJECT_ID('dbo.Person','U') IS NOT NULL DROP TABLE dbo.Person;

    GO

    CREATE TABLE dbo.Person(ID int, Name VARCHAR(30), Mother INT, Father INT);

    GO

    INSERT dbo.Person

    VALUES(1, 'Sue', NULL, NULL)

    ,(2, 'Ed', NULL, NULL)

    ,(3, 'Emma', 1, 2)

    ,(4, 'Jack', 1, 2)

    ,(5, 'Jane', NULL, NULL)

    ,(6, 'Bonnie', 5, 4)

    ,(7, 'Bill', 5, 4);

    GO

    Create the recursive CTE to find all of Bonnie's ancestors.

    WITH Generation (ID) AS

    (

    First anchor member returns Bonnie's mother.

    SELECT Mother

    FROM dbo.Person

    WHERE Name = 'Bonnie'

    UNION

    Second anchor member returns Bonnie's father.

    SELECT mother

    FROM dbo.Person

    WHERE Name = 'Bonnie'

    UNION ALL

    First recursive member returns male ancestors of the previous generation.

    SELECT Person.Father

    FROM Generation, Person

    WHERE Generation.ID=Person.ID

    UNION ALL

    Second recursive member returns female ancestors of the previous generation.

    SELECT Person.Mother

    FROM Generation, dbo.Person

    WHERE Generation.ID=Person.ID

    )

    SELECT Person.ID, Person.Name, Person.Mother, Person.Father

    FROM Generation, dbo.Person

    WHERE Generation.ID = Person.ID;

    GO

     

    --yours sincerlly

  • In the example you gave, the reason you are NOT seeing a duplicate when you have mother in there twice instead of father is because you used UNION.  UNION will give you a distinct list.  UNION's are applied in order from top to bottom, so the UNION is handled prior to the UNION ALLs that fallow.  so your query is equivalent to having the first SELECT removed.

    So as an example:

    SELECT 1
    UNION
    SELECT 1
    UNION ALL
    SELECT 2

    would be handled as the first select gets executed, then the second one, then the results are sorted to remove duplicates, then the 3rd SELECT is executed and the results are added onto the result set giving us 2 rows of 1 and 2.  This is the same in a SELECT, a CTE, a recursive CTE, whatever.  Technically, in the back-end I don't think that is what happens, but it is effectively what is happening.  The optimizer MAY see the 2 queries being identical and strip one out too.

    Now, the second question, my understanding distinct is always applied at the end.  This can be verified by viewing the execution plan which will show you a "SORT (DISTINCT SORT)" operation occurs right before the SELECT.  So you can see that distinct is the last operation being performed prior to the SELECT finishing.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

Viewing 2 posts - 1 through 1 (of 1 total)

You must be logged in to reply to this topic. Login to reply