Eliminating Cursors

  • Comments posted to this topic are about the content posted at http://www.sqlservercentral.com/columnists/kAli/eliminatingcursors.asp

  • Just a few code corrections (being pedantic): the insert won't work unless you provide the list of columns that you want to insert into....

    INSERT INTO @item_table

    (item_category_id, order_id)

    SELECT -- Same SELECT statement as that for the CURSOR

    it.item_category_id

    ,ord.order_id

    FROM dbo.item_categories it

    Also, another performance improvement would be to set the @loop_counter variable to @@ROWCOUNT after you have done the insert into the table variable (depending on how many records are in the table, this could improve performance several fold):

    --now populate this table with the required item category values

    DECLARE @loop_counter INT

    INSERT INTO @item_table

    (item_category_id, order_id)

    SELECT -- Same SELECT statement as that for the CURSOR

    it.item_category_id

    ,ord.order_id

    FROM dbo.item_categories it

    INNER JOIN dbo.orders ord

    ON ord.item_category_id = it.item_category_id

    WHERE ord.order_date >= '1-sep-05'

    and it.isSuspended != 1

    SET @loop_counter = @@ROWCOUNT

    DECLARE @item_category_counter INT

    Further to this can anyone give me stats on when using this method becomes more efficient than using cursors. i.e. how many records selected in the cursor, machine memory config etc. etc. etc.

    I have been using cursors for years (coming from an Oracle background) and have not had problems with them!!

    Thanks

  • Although it is best practice to name columns, and especially in such an article it would be advisable to do so, it is not true that the insert will not work. The insert will work as the author wrote it.

    The setting of @loop_counter is done only once in the entire process, so I doubt the proposed change would improve performance several fold. But, same as Bob, I would be very interested in some closer info about why replacing a cursor with a WHILE loop increases performance by 75%. The loop in the example generally does the same as the cursor, iterating row after row and starting some procedure for each row... so where is the huge gain? Better use of indexes? A reused query plan?

    IMHO the main problem with cursors is that some people tend to use them where it is possible to use a true set-based approach, i.e. update 1000 rows at once instead of updating one after another. I'm not so sure about the example shown... it is one of the cases where I think cursors aren't really too bad. Since I have never had to use cursors, I prefer to avoid them altogether - but if you are familiar with them, this could be one of the situations where a cursor is OK. I'm really curious what some of the more experienced posters will say...
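
    To illustrate the set-based alternative described above, here is a minimal sketch using the article's tables (the is_on_hold column is hypothetical, purely for illustration):

    -- Set-based version: one UPDATE touches every qualifying row at once,
    -- replacing an entire cursor/WHILE loop of single-row updates.
    UPDATE ord
    SET    ord.is_on_hold = 1            -- hypothetical column
    FROM   dbo.orders ord
    INNER JOIN dbo.item_categories it
            ON it.item_category_id = ord.item_category_id
    WHERE  it.isSuspended = 1;

    Whether this is possible depends on what usp_generate_purchase_order actually does per row; if its logic can be expressed as joins, the loop disappears entirely.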

  • Thanks for the great article, but I have a question...

    I have a lot of procedures that use cursors running in the background, and the server has low performance all day long.

    If my query returns more than 1,000,000 rows, is this still the best option?

    Best regards,

    Fabiano


  • Sorry, are you saying that cursors are the best option or loading into a table variable and then iterating through it is the best option.....

  • I have used this method for years (because I hate cursors), but do it slightly differently. I use a #Temp table and then select the TOP 1 at the beginning of the loop, with a delete at the end of the loop. I would recommend this method to anybody in a position that HAS to use a cursor.
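
    A minimal sketch of that pattern, reusing the tables and procedure from the article (the #work table name is illustrative):

    -- Stage the work set once, then consume it one row at a time.
    SELECT item_category_id, order_id
    INTO   #work
    FROM   dbo.orders;

    DECLARE @item_category_id INT, @order_id INT;

    WHILE EXISTS (SELECT 1 FROM #work)
    BEGIN
        -- Take one row at the top of the loop
        -- (without an ORDER BY, the order is not guaranteed).
        SELECT TOP 1 @item_category_id = item_category_id,
                     @order_id         = order_id
        FROM   #work;

        EXEC dbo.usp_generate_purchase_order @item_category_id, @order_id;

        -- ...and remove it at the bottom.
        DELETE FROM #work
        WHERE  item_category_id = @item_category_id
          AND  order_id         = @order_id;
    END

    DROP TABLE #work;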

    5ilverFox
    Consulting DBA / Developer
    South Africa

  • (The comments are to the author of the article)

    Have you tried to convert the cursor to a static cursor?

    Because, in my opinion, that's what you are doing with a temporary table (defining the result set "a priori"). I have seen great improvements from this simple modification: changing DECLARE curs_name CURSOR to DECLARE curs_name CURSOR STATIC (assuming you can ignore modifications to the original result set).
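
    A sketch of that one-word change applied to the cursor from the article (the cursor name is illustrative):

    -- STATIC materializes the result set in tempdb once at OPEN time,
    -- so subsequent fetches never re-check the base tables.
    DECLARE item_cursor CURSOR STATIC FOR
        SELECT it.item_category_id, ord.order_id
        FROM   dbo.item_categories it
        INNER JOIN dbo.orders ord
                ON ord.item_category_id = it.item_category_id
        WHERE  ord.order_date >= '1-sep-05'
          AND  it.isSuspended != 1;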

    I always try to achieve a set-based solution, but if you are only trying to speed up the operation with the least amount of change, sometimes it pays to play with the cursor options. Of course, with a temporary table, you can more easily migrate to a set-based solution.

    More problematic (at least, finding an elegant solution) is using set operations when the tables use an identity column (especially when the identity of one table is the foreign key of another table).
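
    For what it's worth, SQL Server 2005 introduced the OUTPUT clause, which handles this identity/foreign-key case in a set-based way. A sketch, assuming order_id is the identity column on dbo.orders:

    -- Capture the identities generated by a multi-row insert
    -- so they can be used as foreign keys without a loop.
    DECLARE @new_orders TABLE (order_id INT, item_category_id INT);

    INSERT INTO dbo.orders (item_category_id, order_date)
    OUTPUT inserted.order_id, inserted.item_category_id
    INTO   @new_orders (order_id, item_category_id)
    SELECT item_category_id, GETDATE()
    FROM   dbo.item_categories
    WHERE  isSuspended != 1;

    -- @new_orders now maps each new identity back to its source row.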

    Best regards,

    André Cardoso

  • I can't understand where this hate of cursors has come from. Is it a general DBA thing, or something else?

    I would have thought that using a cursor/recordset would be much simpler to control than selecting into temp tables/table variables etc.

    Is it a performance hangover from SQL 7? I just do not see any performance improvement in 2000.....

    Bob

  • You can also add a PRIMARY KEY to the table definition:

    DECLARE @item_table TABLE (primary_key INT IDENTITY(1,1) NOT NULL PRIMARY KEY,

                               item_category_id INT,

                               order_id INT)

    The following SELECT is then better:

    SELECT @item_category_id = item_category_id ,@order_id = order_id

    FROM @item_table

    WHERE primary_key = @item_category_counter
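
    Putting the keyed table variable to use, the article's loop can then seek on the clustered primary key each iteration (variable and procedure names follow the article; the loop body is abbreviated):

    DECLARE @loop_counter INT, @item_category_counter INT;
    DECLARE @item_category_id INT, @order_id INT;

    -- @loop_counter is set from @@ROWCOUNT immediately after the
    -- INSERT INTO @item_table, as suggested earlier in the thread.
    SET @item_category_counter = 1;

    WHILE @item_category_counter <= @loop_counter
    BEGIN
        -- Clustered index seek instead of a scan on every iteration.
        SELECT @item_category_id = item_category_id,
               @order_id         = order_id
        FROM   @item_table
        WHERE  primary_key = @item_category_counter;

        EXEC dbo.usp_generate_purchase_order @item_category_id, @order_id;

        SET @item_category_counter = @item_category_counter + 1;
    END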


  • I think it's worth making the point that this approach does nothing to combat what is fundamentally bad about cursors. All we have done here is create a cursor without using the DECLARE CURSOR syntax.

    The article states:

    <However, this kills the whole theory of set-based operations, which are inherently faster than their row based counterparts. The following is one way of converting your CURSOR statement using the WHILE keyword in such a scenario>

    Yes, but this still isn't set based is it? It's still doing everything row by row, which is exactly what we want to avoid.

    Much better to take a step back and see if you process many, or all of the purchase orders in a small number of set based sql statements, and try not to take such a procedural approach.

    I'd also say that the 75% performance improvement is highly dubious. How can this be? Presumably, the work is being done within the usp_generate_purchase_order procedure. Are we really saying that the looping code was the bottleneck? If this process took 1 min originally, are we saying that substituting a WHILE loop for the cursor reduced it to 15 secs? I really don't think so.

    I'm not saying that this construct should never be used. There are many cases when you might want to slow things down (archiving/purging/data moving for example) to aid concurrency and reduce the size of transactions, but this is not a good pattern to use in high performance database programming.
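
    For the archiving/purging case mentioned above, a common batched pattern on SQL Server 2000 uses SET ROWCOUNT to keep each transaction short (the table and column names here are hypothetical):

    -- Purge in small batches so each transaction stays short
    -- and other sessions are not blocked for the whole purge.
    SET ROWCOUNT 5000;                 -- cap each DELETE at 5000 rows

    DECLARE @deleted INT;
    SET @deleted = 1;

    WHILE @deleted > 0
    BEGIN
        DELETE FROM dbo.audit_log      -- hypothetical purge target
        WHERE  log_date < '1-sep-05';

        SET @deleted = @@ROWCOUNT;     -- 0 when nothing is left to purge
    END

    SET ROWCOUNT 0;                    -- restore the default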


  • I tested the loop solution and it hardly offered any performance improvement over a cursor. I would like to see the test specification that showed a 75% improvement.

  • I've used both methods of processing, and from my experience I would say the biggest gain was from inserting the data into a temporary table first. A cursor created from the same set of static data would probably yield the same performance.

  • I think the performance gains are obtained when the stored procedure updates/interferes with the result set.

    In that case, the performance gains can be significant.

  • I thought SQL Server Central was a place where developers posted original authorship.  Clearly, if this were original authorship, the author would have done his due diligence and figured out that his procedure code was not working efficiently due in part to TempDB filling up.  Moreover, the author would have highlighted the margin of diminishing returns with respect to using TempDB versus memory as a means of improving response times to the caller functions in his allegedly plagiarized article.  In fact, the author of this article could have made mention of the source of his findings and perhaps pointed his readers to said source.  For example, what title/publisher, what URL, or who the original author was that this author copied, pasted, and called his own in this age-old technique.

  • Ah yes, because the cursor is non-static it would have to check for updates to its data before the next fetch.

    If the sp is updating the data the cursor uses then this would be costly.

    I would therefore go back to using a static cursor (i.e. it fetches all the data first, then iterates through the recordset).

    I have been doing a test, just iterating through 800K rows of a table, fetching data only...

    A static cursor took 8.5 secs to iterate through the whole lot

    A normal cursor took 17.5 secs

    and the table variable based approach took 30 secs!!!!

    This was all running on my machine not the server, I will maybe test same scenario on the server and report back...

    Cheers

    Bob

Viewing 15 posts - 1 through 15 (of 296 total)
