• andrew.norris 90251 (7/3/2015)


    Simple example. You have a parent child relationship in your data that has a ragged child hierarchy. You need to use different logic on different types of parent records and then on different types of child records that are related to that parent, and even based on the number of child records you find the logic to be executed needs to be different. The non set based logic of a cursor makes debugging easier as you can very easily step through the logic, rather than simply looking at the result of a set operation (or more likely a number of set operations)

    While it will be possible to do with set based operations and numerous temp tables - in some circumstances cursors provide a better mechanism, not sure on speed. But clearly if a cursor is perfoming in a 'performant' manner from a user perspective and resources are not constrained on the server, then what is the issue. I have also seen set operations which perform poorly, so let's not necessarily blame the construct.

    Again I am not sure why Oracle can make their PL/SQL using cursors as fast as set based operations and MS cannot. Cursors are not evil, and not bad, just a different method of processing data.

    Just my opinion, and I will use them where I think there is an advantage in doing so.

    Cheers

    First, I agree that MS could have done a better job in its rendition of cursors. Since you've said it twice, I also agree that Oracle did cursors better but good set-based code still crushes equivalent code even there. So let's move on because we can't change what MS did and what Oracle does has nothing to do with what MS did. 😀

    I also specifically asked for something other than a hierarchical example for where cursor usage is valuable because it's been proven many times that hierarchies are one place where most people over-look set based methods and think the only way to solve such things quickly and easily is through the use of cursors and other forms of RBAR. Here's a quote of that post.

    Jeff Moden (7/3/2015)


    andrew.norris 90251 (6/7/2015)


    Nothing wrong with cursors they have their place and are very useful.

    Do you have examples for [font="Arial Black"]something other hierarchies [/font]and bin stacking problems?

    Since you brought up hierarchies, though, let's use that as an example. The stored procedure near the end of this post will build a "perfect" Adjacency List hierarchy (meaning "no cycles" in the hierarchy). Your challenge, if you decide to accept it, is to create a result set from that Adjacency list that does the following.

    For each EmployeeID...

    1. List the EmployeeID

    2. Produce a grand total row for each employee consisting of the following:

    2.1. The hierarchical Level of the employee compared to the root node.

    2.2. The hierarchical Level of the employee compared to their self.

    2.3. The total number of employees in the "downline" of that employee including that employee.

    2.4. The total amount of sales in the "downline" of that employee including that employee.

    3. Produce a sub-total row for each level in that employee's "downline" consisting of the following:

    3.1. The hierarchical Level of that level compared to the root node.

    3.2. The hierarchical Level of that level compared to the employee that is at the top (the "top" being the EmployeeID from #1 above) of that particular "downline".

    3.3. The total number of employees in that level of that particular "downline"

    3.4. The total sales of all employees in that level of that particular "downline".

    In other words, give a shorter example of an Adjacency List like this (the Tan area being just one employee {Bob} and his downline)...

    ... produce a result set that looks like the following (the colored rows match the colored nodes in the org chart above)...

    You use a cursor. I'll do something else. We'll test against the smaller org chart to ensure the code works correctly and then well run it against the table the attached proc builds and see if we actually need a cursor to do this or not and which might be faster. I'll also admit that I've never tried something like this with a cursor, so it's VERY possible that you might be teaching me a great lesson here.

    Are you up for it?

    Here's the code to build the small table according to the org chart above...

    --===== Do this in a nice, safe place that everyone has

    USE tempdb;

    GO

    CREATE PROCEDURE dbo.BuildSmallEmployeeTable AS

    /****************************************************************************

    Purpose:

    Create a standard "well formed" Adjacency List hierarchy with indexes.

    This procedure takes no parameters.

    Usage:

    EXEC dbo.BuildSmallEmployeeTable;

    Revision History:

    Initial creation - Circa 2009 - Jeff Moden

    Rev 01 - 15 May 2010 - Jeff Moden

    - Abort if current DB isn't "tempdb" to protect users.

    ****************************************************************************/

    --===========================================================================

    -- Presets

    --===========================================================================

    --===== Supresss the autodisplay of rowcounts to cleanup the display and to

    -- prevent false error returns if called from a GUI.

    SET NOCOUNT ON;

    --===== Make sure that we're in a safe place to run this...

    IF DB_NAME() <> N'tempdb'

    BEGIN

    RAISERROR('Current DB is NOT tempdb. Run aborted.',11,1);

    RETURN;

    END;

    --===== Conditionaly drop the test table so we can do reruns more easily

    IF OBJECT_ID('tempdb.dbo.Employee','U') IS NOT NULL

    DROP TABLE tempdb.dbo.Employee;

    --===========================================================================

    RAISERROR('Building the hierarchy table...',0,1) WITH NOWAIT;

    --===========================================================================

    --===== Create the test table with a clustered PK and an FK to itself to make

    -- sure that a ManagerID is also an EmployeeID.

    CREATE TABLE dbo.Employee

    (

    EmployeeID INT NOT NULL,

    ManagerID INT NULL,

    EmployeeName VARCHAR(10) NOT NULL,

    CONSTRAINT PK_Employee

    PRIMARY KEY CLUSTERED (EmployeeID),

    CONSTRAINT FK_Employee_Employee

    FOREIGN KEY (ManagerID)

    REFERENCES dbo.Employee (EmployeeID)

    ON UPDATE NO ACTION

    ON DELETE NO ACTION

    )

    ;

    --===========================================================================

    RAISERROR('Populate the hierarchy table...',0,1) WITH NOWAIT;

    --===========================================================================

    --===== Populate the test table with test data. Each child ID has a parent ID

    -- adjacent to it on the same row which is why it's called an "Adjacency

    -- List".

    INSERT INTO dbo.Employee

    (EmployeeID, ManagerID, EmployeeName)

    SELECT 26,NULL,'Jim' UNION ALL

    SELECT 2, 26,'Lynne' UNION ALL

    SELECT 3, 26,'Bob' UNION ALL

    SELECT 6, 17,'Eric' UNION ALL

    SELECT 8, 3,'Bill' UNION ALL

    SELECT 7, 3,'Vivian' UNION ALL

    SELECT 12, 8,'Megan' UNION ALL

    SELECT 14, 8,'Kim' UNION ALL

    SELECT 17, 2,'Butch' UNION ALL

    SELECT 18, 39,'Lisa' UNION ALL

    SELECT 20, 3,'Natalie' UNION ALL

    SELECT 21, 39,'Homer' UNION ALL

    SELECT 39, 26,'Ken' UNION ALL

    SELECT 40, 26,'Marge'

    ;

    RAISERROR('There are %u rows in the hierarchy.',0,1,@@ROWCOUNT) WITH NOWAIT;

    --===========================================================================

    RAISERROR('Adding an additional index ...',0,1) WITH NOWAIT;

    --===========================================================================

    --===== Create an additional index to speed things up

    CREATE UNIQUE INDEX By_ManagerID_EmployeeID

    ON TempDB.dbo.Employee (ManagerID,EmployeeID)

    ;

    --===========================================================================

    -- Exit

    --===========================================================================

    RAISERROR('===============================================',0,1) WITH NOWAIT;

    RAISERROR('RUN COMPLETE',0,1) WITH NOWAIT;

    RAISERROR('===============================================',0,1) WITH NOWAIT;

    GO

    ... and here's the code to build the large one I spoke of....

    --===== Do this in a nice, safe place that everyone has

    USE tempdb;

    GO

    CREATE PROCEDURE dbo.BuildLargeEmployeeTable

    /****************************************************************************

    Purpose:

    Create a randomized "well formed" Adjacency List hierarchy with indexes.

    Progammer's Notes:

    1. Each EmployeeID (except for the Root Node, of course) is assigned a

    random ManagerID number which is initially always less than the current

    EmployeeID to ensure that no cycles occur in the hierarcy.

    2. The second parameter used to call this stored procedure will optionally

    randomize the EmployeeIDss to make the hierarchy truly random as it would

    likely be in real life. This, of course, takes a small amounnt of extra

    time.

    3. This code runs nasty fast and is great for testing hierarchical

    processing code. Including the index builds, this code will build a

    million node Adjacency List on a 4 processor (i5) laptop with 6GB of RAM

    in just several seconds. The optional randomization adds just several

    more seconds.

    Usage:

    --===== Create the hierarchy where all the ManagerIDs are less than the

    -- EmployeeIDs. This is the fastest option and will build a million node

    -- hierarchy in just about 7 seconds on a modern machine.

    EXEC dbo.BuildLargeEmployeeTable 1000000;

    --===== Making the second parameter a non-zero value will further randomize

    -- the IDs in the hierarchy. This, of course, takes extra time and will

    -- build a million row hierarchy in about 17 seconds on a modern

    -- machine.

    EXEC dbo.BuildLargeEmployeeTable 1000000,1;

    Revision History:

    Initial concept and creation - Circa 2009 - Jeff Moden

    Rev 01 - 15 May 2010 - Jeff Moden

    - Abort if current DB isn't "tempdb" to protect users.

    Rev 02 - 13 Oct 2012 - Jeff Moden

    - Add a randomization stop to make the hierarchy more like real life.

    ****************************************************************************/

    --===== Declare the I/O parameters

    @pRowsToBuild INT,

    @pRandomize TINYINT = 0

    AS

    --===========================================================================

    -- Presets

    --===========================================================================

    --===== Supresss the autodisplay of rowcounts to cleanup the display and to

    -- prevent false error returns if called from a GUI.

    SET NOCOUNT ON;

    --===== Make sure that we're in a safe place to run this...

    IF DB_NAME() <> N'tempdb'

    BEGIN

    RAISERROR('Current DB is NOT tempdb. Run aborted.',11,1);

    RETURN;

    END;

    --===== Conditionaly drop the test table so we can do reruns more easily

    IF OBJECT_ID('tempdb.dbo.Employee','U') IS NOT NULL

    DROP TABLE tempdb.dbo.Employee;

    --===========================================================================

    RAISERROR('Building the hierarchy...',0,1) WITH NOWAIT;

    --===========================================================================

    --===== Build the test table and populate it on the fly.

    -- Everything except ManagerID is populated here. The code uses a

    -- technique called a "Psuedo-Cursor" (kudos to R. Barry Young for the

    -- term) to very quickly and easily build large numbers of rows.

    SELECT TOP (@pRowsToBuild)

    EmployeeID = ISNULL(CAST(

    ROW_NUMBER() OVER (ORDER BY (SELECT 1))

    AS INT),0),

    ManagerID = CAST(NULL AS INT),

    EmployeeName = CAST(NEWID() AS VARCHAR(36))

    INTO TempDB.dbo.Employee

    FROM master.sys.all_columns ac1

    CROSS JOIN master.sys.all_columns ac2

    CROSS JOIN master.sys.all_columns ac3

    ;

    RAISERROR('There are %u rows in the hierarchy.',0,1,@@ROWCOUNT) WITH NOWAIT;

    --===== Update the test table with ManagerID's. The ManagerID is some random

    -- value which is always less than the current EmployeeID to keep the

    -- hierarchy "clean" and free from "loop backs".

    UPDATE TempDB.dbo.Employee

    SET ManagerID = CASE

    WHEN EmployeeID > 1

    THEN ABS(CHECKSUM(NEWID())) % (EmployeeID-1) +1

    ELSE NULL

    END

    ;

    --===========================================================================

    -- Conditionally randomize the hierarchy to be more like real life

    --===========================================================================

    IF @pRandomize <> 0

    BEGIN

    --===== Alert the operator

    RAISERROR('Randomizing the hierarchy...',0,1) WITH NOWAIT;

    --===== Create a randomized cross reference list to randomize the

    -- EmployeeIDs with.

    SELECT RandomEmployeeID = IDENTITY(INT,1,1),

    EmployeeID

    INTO #RandomXRef

    FROM dbo.Employee

    ORDER BY NEWID()

    ;

    --===== Update the ManagerIDs in the Employee table with the new

    -- randomized IDs

    UPDATE emp

    SET emp.ManagerID = RandomEmployeeID

    FROM dbo.Employee emp

    JOIN #RandomXRef xref ON emp.ManagerID = xref.EmployeeID

    ;

    --===== Update the EmployeeIDs in the Employee table with the new

    --randomized IDs

    UPDATE emp

    SET emp.EmployeeID = RandomEmployeeID

    FROM dbo.Employee emp

    JOIN #RandomXRef xref ON emp.EmployeeID = xref.EmployeeID

    ;

    END

    ELSE

    BEGIN

    --===== Alert the operator

    RAISERROR('The hierarchy is not randomized',0,1) WITH NOWAIT;

    END

    ;

    --===========================================================================

    -- Build the indexes necessary for performance.

    --===========================================================================

    --===== Alert the operator

    RAISERROR('Building the keys and indexes...',0,1) WITH NOWAIT;

    --===== Add some indexes that most folks would likely have on such a table

    ALTER TABLE TempDB.dbo.Employee

    ADD CONSTRAINT PK_Employee PRIMARY KEY CLUSTERED (EmployeeID)

    ;

    CREATE UNIQUE INDEX By_ManagerID_EmployeeID

    ON TempDB.dbo.Employee (ManagerID,EmployeeID)

    ;

    ALTER TABLE dbo.Employee

    ADD CONSTRAINT FK_Employee_Employee FOREIGN KEY

    (ManagerID) REFERENCES dbo.Employee (EmployeeID)

    ON UPDATE NO ACTION

    ON DELETE NO ACTION

    ;

    --===========================================================================

    -- Exit

    --===========================================================================

    RAISERROR('===============================================',0,1) WITH NOWAIT;

    RAISERROR('RUN COMPLETE',0,1) WITH NOWAIT;

    RAISERROR('===============================================',0,1) WITH NOWAIT;

    GO

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)