Stairway to Advanced T-SQL Level 4: Record Level Processing Using Transact-SQL Cursors

  • Comments posted to this topic are about the item Stairway to Advanced T-SQL Level 4: Record Level Processing Using Transact-SQL Cursors

    Gregory A. Larsen, MVP

  • In the past I've found that SQL Cursor are slow and difficult to use. An alternative is to use a while loop with a table variable that stores a seeded integer column that then iterates on that seeded int value. I haven't performance tested on later version of SQL but there were significant performance gains when lasted tested in SQL 2005

    Enjoy!

    ----------------------------------------------------------------------------------------

    SET NOCOUNT ON

    /*Declare a table variable that you will iterate using the seeded integer value iRow */

    DECLARE @tLoopTable table(iRow int IDENTITY (1, 1) NOT NULL, TABLE_SCHEMA nvarchar(128), TABLE_NAME nvarchar(128) UNIQUE (iRow))

    /*Declare variables for the iteration count and the column data for each row */

    DECLARE @iRow int, @iMaxRow int, @TABLE_SCHEMA nvarchar(128), @TABLE_NAME nvarchar(128)

    /*Load the table variable */

    INSERT INTO @tLoopTable (TABLE_NAME, TABLE_SCHEMA)

    SELECT TABLE_NAME, TABLE_SCHEMA

    FROM INFORMATION_SCHEMA.TABLES WITH(NOLOCK)

    WHERE TABLE_TYPE = 'BASE TABLE'

    /*Get the row count and set the counter variables */

    SELECT @iRow = MIN(iRow), @iMaxRow = Max(iRow) from @tLoopTable

    Print 'Total Rows ' + rtrim(@iMaxRow)

    /*Loop the row counters*/

    WHILE @iRow <= @iMaxRow

    BEGIN

    /*set the column variables with the current row */

    SELECT @TABLE_SCHEMA = TABLE_SCHEMA, @TABLE_NAME = TABLE_NAME FROM @tLoopTable WHERE iRow = @iRow

    /*Do something with the current row data*/

    PRINT rTRIM(@iROW) + ' - '+@TABLE_SCHEMA + '.' + @TABLE_NAME

    /*Increment row counter */

    SET @iRow = @iRow + 1

    END

  • I did some performance checks and analysis on cursors a while back... you might also find these blog posts of mine interesting:

    http://bradsruminations.blogspot.com/2010/05/truth-about-cursors-part-1.html

    http://bradsruminations.blogspot.com/2010/05/truth-about-cursors-part-2.html

    http://bradsruminations.blogspot.com/2010/05/truth-about-cursors-part-3.html

    (Yikes! That was exactly 5 years ago this month!)

    I haven't read Greg's article yet... I'm sure he touched on many of the things I mentioned as well.

    --Brad

  • I use cursors once in a while. Thanks for the retraining to make sure I use the proper options.

  • This is a great article on Cursors, however, I wish the performance disclaimer was in bold and at the top of the article. I have worked at three different places now where over 60% of the regularly used sprocs templates, and even the built-in /automated procedures are riddled with cursors. In some cases, I've met a lot of resistance trying to clean those up. (I'm talking about bulk inserts and scripts that could easily be replaced with set logic.)

    It seems that the problem is we have a lot of brilliant programmers who migrate to writing TSQL and they carry their seniority in the company or in the field with them. They override their experienced SQL counterparts and the result is a lot of unnecessary cursors in templates and commonly used sprocs. It's frustrating to encounter these misused cursors and more frustrating to be forced to use them in my own code when I'm asked to start with an existing template containing cursors.

    And with that caveat, I'm going to keep this article handy because I love the concept of jumping around the relative records. This information is a real keeper. 🙂 Thanks for covering this!

  • Good article. Thank you.

    It's worth noting a little-known bit about cursors: They can be declared like any other variable. For instance,

    Declare @MyCursor Cursor;

    And subsequently SET to the select statement. I won't repeat other authors' works here, but you can search for "SQL Server cursor variable" to see how it's implemented. This is significant - should your process error out before the CLOSE; DEALLOCATE statements, resources are released like any other variable.

  • Paul Novelli (5/6/2015)


    Good article. Thank you.

    It's worth noting a little-known bit about cursors: They can be declared like any other variable. For instance,

    Declare @MyCursor Cursor;

    And subsequently SET to the select statement. I won't repeat other authors' works here, but you can search for "SQL Server cursor variable" to see how it's implemented. This is significant - should your process error out before the CLOSE; DEALLOCATE statements, resources are released like any other variable.

    Paul, that last statement blows me out of the water - one of my main peeves with CURSORs was that developers often fail to close and deallocate them. It is difficult to test that, though. If true, it should be easier to convince the developers to declare them with the @ symbol - I won't need to "kill" so many SPIDs!:-)

  • I really appreciated the article. I am a transplanted mainframe DB2 DBA and cursor processing was the standard on that platform. It had some additional features that allowed updates/inserts/deletes to occur without messing too much with concurrency. I was amazed when moving into the Windows/SQL Server platform that cursor processor was highly discouraged. That is why I read with interest what you wrote. Personally, as SQL Server continues to grow in stature, I believe there will be a customer demand to enable more cursor processing. Continuous operations is the destination for the database world. That means big tables have to have their batch processes run while someone in Singapore can look at, and update, his/her data (on Christmas morning no less). The way to do this is to put more automation into the product to allow for more cursor processing and not rely on SET processing. Some features would be to allow variable commits with repositioning and restartability should your SP fail.

  • You don't need cursor processing for what you're talking about Jim.

    You do bring up a point though. I would have expected an article on "Advanced T-SQL" to better explain what cursors should be used for, why they should generally be avoided, and some good examples on how to avoid them.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".
    "Dear Lord... I'm a DBA so please give me patience because, if you give me strength, I'm going to need bail money too!"

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (5/7/2015)


    ... I would have expected an article on "Advanced T-SQL" to better explain what cursors should be used for, why they should generally be avoided, and some good examples on how to avoid them.

    I can think of lots of cases where they aren't necessary and should be avoided.

    What are some examples where they make sense?

  • Amen to that.

    There are the usually obvious cases where they should be avoided, but the cases where they're ideal is never really covered. It's like cursors are a bad word, when they're a tool like anything else. They have their place. But what exactly is that place?

  • As I mentioned, continuous operations. SQL Server doesn't see that much of it yet, but it will be coming with a vengence. I don't mean "24x7" (which gets redefined for marketing reasons), but permanently continuous operations from the users' standpoint. When you have large table updates, SET processing will lock those users out. Cursor processing as I am mentioning would enable processing in small chunks -- basically closer to that of a transaction. That is, update a thousand rows at a time, then commit, then resume where you left off, repeat until completed. If something breaks and errors off, you fix it, restart the job, and it automatically picks up where it left off.

    I used to work for a large financial services company and that was the mode. It was a thousand times more complicated than typical failover processing. It was likened to having a fleet of 18 wheeler trucks and expecting to do your periodic service/maintenance on them without them stopping or slowing down (much). The company basically guaranteed users from Australia to Zimbabwe could access their financial data day or night. We pulled it off, but it was very difficult.

  • Oddly enough, Jim, those are SET based operations being controlled by a loop or "cursor"... just smaller sets. It's a tried and true method for doing (for example) massive deletes.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".
    "Dear Lord... I'm a DBA so please give me patience because, if you give me strength, I'm going to need bail money too!"

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • marcia.j.wilson (5/7/2015)


    Jeff Moden (5/7/2015)


    ... I would have expected an article on "Advanced T-SQL" to better explain what cursors should be used for, why they should generally be avoided, and some good examples on how to avoid them.

    I can think of lots of cases where they aren't necessary and should be avoided.

    What are some examples where they make sense?

    The only 3 places that I can think of using it are for ...

    1. Batch control like Jim has been talking about. Still, I prefer other methods than a cursor for this and only because I went so long without ever even thinking of using a cursor. This could include "stepping" through a list of filenames to import, doing "mega deletes" a couple thousand rows at a time, or running a stored procedure or bit of built in code that was (poorly) designed to do just one thing at a time (sp_send_dbmail is a good example).

    2. Another use is to run some form of meta-data change or check on multiple tables/things across databases. For example, you might write one to find a particular column name in all tables of all databases and tell you where it's been used.

    3. Last but not least, there are the rarities when only RBAR will do such as the ol' bin-stacking problem.

    For all the items, a well written cursor can easily tie and sometimes beat what someone might do with a Temp Table and While Loop. For 1 and 2, neither will be the source of a performance problem.

    For those getting ready to suggest recursive CTEs, those can actually be as bad or worse than a cursor or While Loop depending on what they're doing. In general, anything that counts (increments) one value at a time is going to be a real resource hog as an rCTE.

    The biggest problem with cursors, While Loops, and rCTEs that I've seen is that people do have a hard time figuring things out in a set based fashion and give up to quickly and resort to one of these forms of looping. There is a paradigm shift required that can most easily be stated as I have in my signature line below.

    p.s. And, yes... I agree that 1 and 2 can be done without a cursor or loop but the cursor or loop will not be the performance problem for either.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".
    "Dear Lord... I'm a DBA so please give me patience because, if you give me strength, I'm going to need bail money too!"

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Yes, I too would be interested in knowing when it is appropriate to use them. Jeff has written several articles describing how to avoid their use -- is there ever a case where using a cursor cannot be avoided?

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply