Forming Closed Groups (Sets)

  • I have a situation where I need to form a set of closed groups from information held in a single table in SQL 2012. I know I can do this via a series of "while" loops but given that there could be several hundred thousand records in the table, this seems like an extremely bad idea, I'm also sure that there must be a way of doing this "properly" but at present the solution is eluding me.

    The table contains source data of a manager and a personal assistant, with a 3rd column that will be used to contain the group name, the complexity is that while a manager can have 1 or many personal assistants, a personal assistant can work for multiple managers and may have a personal assistant in their own right.

    My table looks like the following example

    Manager PersonalAssistant

    ======= =================

    A B

    A C

    A D

    B C

    D E

    F G

    F K

    H I

    H J

    H A

    ...

    The requirement is that I form a number of closed groups resulting in the following table

    Group Person

    ===== ======

    Group001 A

    Group001 B

    Group001 C

    Group001 D

    Group001 E

    Group001 H

    Group001 I

    Group001 J

    Group002 F

    Group002 G

    Group002 K

    ...

    Can anyone suggest how this can be implemented, or give me some guidance on where to start?

    The end result will be a stored procedure that can be run as-and-when it is necessary

    Thanks in advance....

  • Give this a shot...

    IF OBJECT_ID('tempdb..#MgtStructure', 'U') IS NOT NULL

    DROP TABLE #MgtStructure;

    CREATE TABLE #MgtStructure (

    EmployeeID CHAR(1),

    ManagerID CHAR(1)

    );

    INSERT#MgtStructure (ManagerID, EmployeeID) VALUES

    ('A', 'B'),

    ('A', 'C'),

    ('A', 'D'),

    ('B', 'C'),

    ('D', 'E'),

    ('F', 'G'),

    ('F', 'K'),

    ('H', 'I'),

    ('H', 'J'),

    ('H', 'A');

    WITH cte_Recursive AS (

    SELECT

    EmployeeID = Manager.ManagerID,

    ManagerID = CAST('' AS CHAR(1)),

    RecustionLevel = 1,

    GroupName = 'Group' + RIGHT('000' + CAST(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS VARCHAR(3)), 3)

    FROM

    #MgtStructure Manager

    WHERE

    NOT EXISTS (SELECT 1 FROM #MgtStructure ms WHERE Manager.ManagerID = ms.EmployeeID)

    GROUP BY

    Manager.ManagerID

    UNION ALL

    SELECT

    Employee.EmployeeID,

    Employee.ManagerID,

    RecustionLevel = r.RecustionLevel + 1,

    r.GroupName

    FROM

    #MgtStructure Employee

    JOIN cte_Recursive r

    ON Employee.ManagerID = r.EmployeeID

    )

    SELECT

    r.GroupName,

    r.EmployeeID

    FROM

    cte_Recursive r

    ORDER BY

    r.EmployeeID

    ;

  • Jason, Many thanks, worked perfectly, just what was needed 🙂

  • Let input be as follows

    INSERT#MgtStructure (ManagerID, EmployeeID) VALUES

    --('A', 'B'),

    ('A', 'C'),

    ('A', 'D'),

    ('B', 'C'),

    ('D', 'E'),

    ('F', 'G'),

    ('F', 'K'),

    ('H', 'I'),

    ('H', 'J')

    --,('H', 'A')

    Which is correct group for C ?

    As OP stated, it's not a tree.

  • martyn.dowsett (12/16/2015)


    Jason, Many thanks, worked perfectly, just what was needed 🙂

    No problem. Glad to help.

  • serg-52 (12/16/2015)


    Let input be as follows

    INSERT#MgtStructure (ManagerID, EmployeeID) VALUES

    --('A', 'B'),

    ('A', 'C'),

    ('A', 'D'),

    ('B', 'C'),

    ('D', 'E'),

    ('F', 'G'),

    ('F', 'K'),

    ('H', 'I'),

    ('H', 'J')

    --,('H', 'A')

    Which is correct group for C ?

    As OP stated, it's not a tree.

    Just a guess, if we're talking "closed groups", does that mean C can only be in one group? So in that case, wouldn't there just be that many less groups generated? Obviously could be wrong here, interesting thread in any case!

  • According to OP statement

    martyn.dowsett (12/15/2015)


    the complexity is that while a manager can have 1 or many personal assistants, a personal assistant can work for multiple managers and may have a personal assistant in their own right.

    it looks as graph should be considered undirected. So it's really a problem of finding connected components of the graph https://en.wikipedia.org/wiki/Connected_component_(graph_theory).

    Just for fun one can try to reproduce well known algorithms in SQL but it hardly is the most suitable tool for the purpose.

  • Since a given assistant can answer to multiple managers, I'd imagine that they are allowed to appear in multiple groups.

    In either case it's a simple matter to choose a "top 1" group if a given assistant need to be tied to a single group.

    The only real danger is the existence of a "closed loop". Like this...

    INSERT#MgtStructure (ManagerID, EmployeeID) VALUES

    ('M', 'N'),

    ('N', 'O'),

    ('O', 'P'),

    ('P', 'M');

    Since there is no root node (an individual who doesn't exist as an assistant) the entire group will be ignored.

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply