January 23, 2004 at 10:09 am
SCENARIO: I have a table which will potentially have millions of rows; at the moment it has 624,560 rows. The table format is:
CREATE TABLE NRGroups (ID_ int primary key, PID int, Grp int)
I am trying to process the rows in form of groups, e.g.
ID_ | PID | Grp
1   | -   | 0
4   | 9   | 0
3   | 6   | 0
7   | 5   | 0
9   | 7   | 0
6   | -   | 0
5   | 1   | 0
2   | -   | 0
1, 5, 7, 9, 4 belong to one group, 6, 3 belong to next group, 2 is the only member in its group.
i.e.
ID_ | PID | Grp
1   | -   | 1
4   | 9   | 1
3   | 6   | 2
7   | 5   | 1
9   | 7   | 1
6   | -   | 2
5   | 1   | 1
2   | -   | 3
P.S. Not every row has a parent or child.
I have written SQL Server code to organise the rows into groups using a nested loop: the inner loop finds all the members of the current group, and the outer loop fetches the next record that still needs a group.
PROBLEM: When I limit the loop to tens of thousands of rows it gets through them sooner or later, but when I run it on the whole lot, i.e. 624,560 rows, it takes over an hour to put them into groups!! Is there any way to speed this process up?
CODE:
DECLARE @iNextRowId int,
        @iCurrentRowId int,
        @iLoopControl int,
        @grp_num int,
        @current int,
        @lvl tinyint

-- Table variable used as a stack for the depth-first walk of each group
DECLARE @stack TABLE (item int, lvl tinyint)

SELECT @iNextRowId = MIN(ID_) FROM NRGroups
SELECT @iLoopControl = 1
SELECT @grp_num = 0

WHILE @iLoopControl = 1 -- OUTER WHILE: one iteration per group
BEGIN
    ------------------FIND all members of one group and UPDATE the Grp column----------------------------
    INSERT INTO @stack VALUES (@iNextRowId, 1)
    SELECT @lvl = 1
    SELECT @grp_num = @grp_num + 1
    WHILE @lvl > 0 -- INNER WHILE: depth-first traversal
    BEGIN
        IF EXISTS (SELECT * FROM @stack WHERE lvl = @lvl)
        BEGIN
            SELECT TOP 1 @current = item FROM @stack WHERE lvl = @lvl
            UPDATE NRGroups SET Grp = @grp_num WHERE ID_ = @current
            DELETE FROM @stack WHERE lvl = @lvl AND item = @current
            INSERT @stack SELECT ID_, @lvl + 1 FROM NRGroups WHERE PID = @current
            IF @@ROWCOUNT > 0
                SELECT @lvl = @lvl + 1
        END
        ELSE
            SELECT @lvl = @lvl - 1
    END -- INNER WHILE
    --------------------end of FIND-------------------------------------------------------------------
    -- Reset looping variable
    SELECT @iCurrentRowId = @iNextRowId
    SELECT @iNextRowId = NULL
    -- Get the next ungrouped Row_Id
    SELECT @iNextRowId = MIN(ID_) FROM NRGroups WHERE ID_ > @iCurrentRowId AND Grp = 0
    -- Did we get a valid next row id?
    IF ISNULL(@iNextRowId, 0) = 0
        BREAK
    --IF @iNextRowId > 1000 BREAK -- DEBUG CODE
END -- OUTER WHILE
January 23, 2004 at 1:32 pm
Now you know why I cringe every time one of my developers wants to do this! If you put a limit on how many levels deep you will support, then you can do this in a set-based way. For example: right now you go four levels deep in your data, so you should be able to do a set-based self-join four levels deep to get the lowest level's group, then do the same for three levels and two levels. Overall it will be much faster than the loop you have above. The only problem is that you are now limited to four levels. Another option might be to create a user-defined function that calls itself to get the group, but again I think you will find that performance suffers.
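A recursive function along those lines might be sketched as follows (`dbo.GetRootID` is a hypothetical name, and this version labels each group with its root row's ID_ rather than a 1..n sequence; note SQL Server caps function nesting at 32 levels, which also caps the supported depth):

```sql
CREATE FUNCTION dbo.GetRootID (@ID int)
RETURNS int
AS
BEGIN
    DECLARE @PID int
    SELECT @PID = PID FROM NRGroups WHERE ID_ = @ID
    IF @PID IS NULL
        RETURN @ID            -- root row: use its own ID_ as the group key
    RETURN dbo.GetRootID(@PID) -- otherwise recurse up to the root
END
```

It could then be applied in one statement, e.g. `UPDATE NRGroups SET Grp = dbo.GetRootID(ID_)`, though a scalar UDF invoked per row is itself a hidden loop, which is why I expect the performance to hurt.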
Gary Johnson
Microsoft Natural Language Group
DBA, Sr. DB Engineer
This posting is provided "AS IS" with no warranties, and confers no rights. The opinions expressed in this post are my own and may not reflect that of my employer.
January 23, 2004 at 2:11 pm
Here is an example of what I mean.
SET NOCOUNT ON
IF EXISTS(SELECT * FROM sysobjects WHERE id = object_id('NRGroups'))
DROP TABLE NRGroups
CREATE TABLE NRGroups
(ID_ int primary key, PID int, Grp int)
INSERT INTO NRGroups(ID_,PID,Grp)
VALUES( 1,NULL,0)
INSERT INTO NRGroups(ID_,PID,Grp)
VALUES( 4,9,0)
INSERT INTO NRGroups(ID_,PID,Grp)
VALUES( 3,6,0)
INSERT INTO NRGroups(ID_,PID,Grp)
VALUES( 7,5,0)
INSERT INTO NRGroups(ID_,PID,Grp)
VALUES( 9,7,0)
INSERT INTO NRGroups(ID_,PID,Grp)
VALUES( 6,NULL,0)
INSERT INTO NRGroups(ID_,PID,Grp)
VALUES( 5,1,0)
INSERT INTO NRGroups(ID_,PID,Grp)
VALUES( 2,NULL,0)
-- Create a temp table to get GroupIDs
IF EXISTS(SELECT * FROM tempdb..sysobjects WHERE id = object_id('tempdb..#Grps'))
DROP TABLE #Grps
CREATE TABLE #Grps
(
ID_ int
,GRPID int identity(1,1)
)
INSERT INTO #Grps (ID_)
SELECT ID_
FROM NRGroups NRG
WHERE NRG.PID IS NULL
-- Update Grp for Top Levels to the grpid from #Grps.
UPDATE NRG
SET Grp = g.GRPID
FROM NRGroups NRG
JOIN #Grps g ON NRG.ID_ = g.ID_
WHERE NRG.PID IS NULL
-- SELECT * FROM NRGroups
-- Update Grp for 5 levels deep.
UPDATE N5
SET Grp = NRG.Grp
--SELECT *
FROM NRGroups NRG
JOIN NRGroups N2 ON NRG.ID_ = N2.PID AND NRG.PID IS NULL
JOIN NRGroups N3 ON N2.ID_ = N3.PID
JOIN NRGroups N4 ON N3.ID_ = N4.PID
JOIN NRGroups N5 ON N4.ID_ = N5.PID
WHERE N5.Grp = 0
-- SELECT * FROM NRGroups
-- Update Grp for 4 levels deep.
UPDATE N4
SET Grp = NRG.Grp
--SELECT *
FROM NRGroups NRG
JOIN NRGroups N2 ON NRG.ID_ = N2.PID AND NRG.PID IS NULL
JOIN NRGroups N3 ON N2.ID_ = N3.PID
JOIN NRGroups N4 ON N3.ID_ = N4.PID
WHERE N4.Grp = 0
-- SELECT * FROM NRGroups
-- Update Grp for 3 levels deep.
UPDATE N3
SET Grp = NRG.Grp
FROM NRGroups NRG
JOIN NRGroups N2 ON NRG.ID_ = N2.PID AND NRG.PID IS NULL
JOIN NRGroups N3 ON N2.ID_ = N3.PID
WHERE N3.Grp = 0
-- SELECT * FROM NRGroups
-- Update Grp for 2 levels deep.
UPDATE N2
SET Grp = NRG.Grp
FROM NRGroups NRG
JOIN NRGroups N2 ON NRG.ID_ = N2.PID AND NRG.PID IS NULL
WHERE N2.Grp = 0
SELECT * FROM NRGroups
Gary Johnson
January 23, 2004 at 2:15 pm
While Gary was posting....
-- Example (No "level" limit) possibility
-- May want to modify for performance & other reasons, but I hope the "gist" gets across
-- Build test data
If object_ID('NRGroupsTest') is not NULL Drop Table NRGroupsTest
CREATE TABLE NRGroupsTest (ID_ int primary key, PID int, Grp int)
insert Into NRGroupsTest ( ID_, PID, Grp)
Select 1, NULL, 0 UNION ALL
Select 4, 9, 0 UNION ALL
Select 3, 6, 0 UNION ALL
Select 7, 5, 0 UNION ALL
Select 9, 7, 0 UNION ALL
Select 6, NULL, 0 UNION ALL
Select 5, 1, 0 UNION ALL
Select 2, NULL, 0
-- Clear #Temp Table
If object_ID('Tempdb..#Temp') is not NULL Drop Table #Temp
-- looping for each level
Declare @DynSQL varchar(8000), @LVL Int, @Cnt Int
Select @Cnt = 1
While @Cnt > 0 Begin
if @LVL Is Null Begin -- 1st time, root grp
-- Make #Temp Table with "root" level records
Select NRG.ID_ as ID, @LVL as LVL, identity(int, 1,1) Grp
Into #Temp
From NRGroupsTest NRG
Where PID is NULL
End Else Begin
-- For Each Level add IDs to #Temp table
Set Identity_Insert #Temp On -- so that data can be put into Grp Col
Insert Into #Temp (ID, LVL, Grp)
Select NRG.ID_, @LVL, T.Grp
From NRGroupsTest NRG
Join #Temp T
On T.ID = NRG.PID and isNull(LVL, 0) = @LVL -1
End
-- If above did any work, @Cnt will be > 0, and loop continues
Select @LVL = IsNull(@LVL, 0) + 1, @Cnt = IsNull(@LVL * 0 + @@rowcount, 1)
Set Identity_Insert #Temp Off
End
-- for show purpose only
select * from #Temp order by grp, ID
/*
-- Then
-- Maybe
Create Clustered Index IX_Temp on #Temp (ID, Grp)
-- Update NRGroupsTest
UPDATE NGR Set NGR.Grp = T.Grp
From NRGroupsTest NGR
join #Temp T On NGR.ID_ = T.ID
*/
-- Cleanup
If object_ID('Tempdb..#Temp') is not NULL Drop Table #Temp
Once you understand the BITs, all the pieces come together
January 23, 2004 at 2:31 pm
Very nice, Thomas! I still think I would limit the depth level. Leaving it open will ultimately leave you in a bit of a performance and maintenance bind.
Gary Johnson
January 23, 2004 at 2:39 pm
Agree Gary, it can be limited simply with SELECT ... WHERE LVL < n in the ELSE portion.
I was also thinking: maybe don't use #Temp for all the records, but only to establish the "root" level Grp records. Update the base table with those values, then do an UPDATE with a "self join" in the ELSE portion. It may run a lot faster, since the indexes on the base table can be used/configured.
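That variation might be sketched roughly like this (assuming the same NRGroupsTest table, and using each root's own ID_ as its group number so that no #Temp identity column is needed; Grp = 0 still means "not yet grouped"):

```sql
-- Seed: each root row becomes its own group, keyed by its ID_
UPDATE NRGroupsTest SET Grp = ID_ WHERE PID IS NULL

-- Push group numbers down one level per pass, joining the base table to itself,
-- until a pass changes no rows
DECLARE @rows int
SELECT @rows = 1
WHILE @rows > 0
BEGIN
    UPDATE Child
    SET Grp = Parent.Grp
    FROM NRGroupsTest Child
    JOIN NRGroupsTest Parent ON Child.PID = Parent.ID_
    WHERE Child.Grp = 0 AND Parent.Grp <> 0

    SELECT @rows = @@ROWCOUNT
END
```

An index on (PID) and one on (Grp) would let each pass be a simple indexed set-based UPDATE, and a depth limit could still be imposed by counting passes.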
January 26, 2004 at 9:56 am
Thank you Gary and Thomas - I really appreciate you both taking the time and effort to assist.
I haven't tested Gary's solution, but I have tested Thomas's piece of code on hundreds of thousands of rows and it is mega fast - and that's without applying the changes Thomas mentioned in his last message.
Cheers