A Case FOR Cursors...

  • Alan.B (6/1/2015)

    So here's what's the takeaway here? It's better for an update to run for several extra seconds to prevent a several hundred ms blip for a single user? Rhetorical question.

    Rhetorical question, yes. But this is still an interesting discussion. I guess it depends on how you define "performant". Execution time & block times are only a small piece of the pie.

    What about TempDB impact if the update involved some sort of GROUP BY calculation?

    What about the load you're putting on the Transaction Log, and additional disk required? What if the log needs to grow and you don't have additional disk to throw at it?

    What if some exception occurs during the large update at the 95% mark (due to resource consumption maybe) and you need to rollback? Ugh..... 🙂

    What about latency in Transactional Replication subscribers? Even more so, what about the overall health of replication? A single large update could necessitate a complete reinitialization if the latency becomes too high, which would increase your downtime & customer impact even more.

    Guess the point I'm trying to make is for the initial question of "when is a cursor a good tool to use?" I would answer this simply by saying "when what you're doing needs to be invisible to others". Large transactions are not invisible, no matter what isolation level you use; maybe they are to other readers, but not anyone that may need to update a different field in the same table. Cursors do not need to be RBAR... they can be used in conjunction with set-based operations to update batches of data, and many times this will not dramatically affect the readability of the code.

    Bottom line = T-SQL developers need to be mindful of the FULL IMPACT that a simple update statement may have on concurrent users, the underlying database layer, and downstream systems.

    Great discussion everyone,

    -Ryan

  • Thanks for the article. Yes, cursors have a bad rap. But we have a few situations that the use of a cursor just makes the most sense.

  • Make that three things you should never discuss in public:

    1) Politics

    2) Religion

    3) Cursors (good OR bad)

    🙂

  • dbishop (6/2/2015)


    Make that three things you should never discuss in public:

    1) Politics

    2) Religion

    3) Cursors (good OR bad)

    🙂

    Ha ha, well, you know, everyone has been pretty polite I think; and although there has been a LOT of back and forth, I think several of us have learned from this. I feel like I did.

    You know, as one who believes we need to establish a fundamentalist dictatorship based on the teachings of the Flying Spaghetti Monster.

  • No politics!

  • I would not base any conclusion on cursors on the code in the MS databases. Some of it is really horrific. In fact, I have seen situations where the widely used (though undocumented) sp_MSForEachDB and sp_MSForEachTable return incorrect results - and this is caused by a bad choice of cursor type in the code.

    That being said, I do agree that, *IF* row by row processing is unavoidable, using a cursor is the best way. Not a cursor with the default options, they suck big time. But I have never seen any non-cursor method of iteration that out-performs a cursor that is defined as either STATIC (if small enough to fit in cache), or FAST_FORWARD (if bigger than available cache).


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Atradius (6/1/2015)


    I have never read the statement "do not use cursors" in any other way than interpreting it as "do not use row-by-row operations, use set operations instead".

    Neither have I, but I have seen the results of many people who do so. So the warning is absolutely valid - "do not use cursors" should not be interpreted as "find another way to iterate".

    One of my previous clients had several procedures that would stuff a set of IDs based on a WHERE clause in a temporary table, then enter a loop that processes a single row from that temp table and deletes it. By just replacing that with a static cursor, I was already able to gain a bit performance. (Unfortunately, I was not there long enough to do a full replacement with set-based code).


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Alan.B (6/1/2015)


    Until someone, anyone, can post an example where a cursor or other iterative method was faster than a well-written set-based solution I will remain convinced that the set-based solution is the way to go.

    A few years ago, I published a set of blog posts about the bin packing problem. I presented a few cursor-based solutions, a fully set-based solution, and then my own hybrid "iteration + set-based" solution. The fully set-based solution was far slower then the cursor-based solutions. This was before windowing functions were introduced so I cannot 100% certain say that this is still correct, but I personally do not know of any way to employ these to create a better set-based bin packing query.

    Link: http://sqlblog.com/blogs/hugo_kornelis/archive/tags/Bin+Packing/default.aspx

    Note that this series of blog posts was inspired by a real world problem (not from my own clients, but from a question I answered on internet).


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Hugo Kornelis (6/2/2015)


    So the warning is absolutely valid - "do not use cursors" should not be interpreted as "find another way to iterate".

    That may be the wisest comment in this thread.

    Don Simpson



    I'm not sure about Heisenberg.

  • Much of the sp_ that return from your query have to do with replication, which nobody much cares about, so there is no reason to expend energy cleaning up old Sybase code. I can tell you that the guys who actually write the code for SQL Server think Sybase was conceived of Lucifer and cursors are his spawn. Cursors are a throwback to mainframe flat file (ISAM) (I won't even call it data basing) days when tape ruled the planet ; and yes I am old enough to have been there when our IBM RD281 disk pack, read heads driven with hydraulic rams, blew a packing and saturated the innards with hydraulic fluid. YUCK.

    I:hehe:f you aren't spending your spare moments learning SET BASED SQL and BI Windows Function programming, shame shame. Discover the OVER clause.

  • Andy DBA (6/1/2015)


    When you say "Just because you can, doesn't mean you should" are you implying that SQL Developers initially think in terms of loops and then arbitrarily decide to write set-based "just because they can"?

    .....

    No, that's not what I'm saying. The vast majority of my T-SQL code has been set based, and I too have removed cursors from other dev's code because in most cases they're not needed. What I'm saying is that there *are* certain problems (and they aren't common and no I don't have one to post here) where a loop with top-down business logic is a clearer more maintainable way. My whole point is the mantra "Cursors are bad and if you use them you're an idiot" (which is how I read some of the posters here) is not neccessarily true.

    I never said that should be the first option, or even the preferred option--again--I'm saying cases do exist for cursors, but it seems the anti-cursor crowd has a religious zealousy and has the torches out like angry villagers chasing down the monsters who (gasp) have a cursor in their code. Sure, some of them need to be schooled. But I'm just here to make this point that they're not to be *banned*, and I don't care if you ridicule me.

    While I'll never tell anyone all my code is perfect--I've got plenty of code I look at and am embarrassed that I actaully wrote that--but I learn constantly and always take pride in what I do and in the cases where I've used cursors (at least those after I felt I'd become a more seasoned coder) I've been able to defend their existence based on readability and maintainability and--as any analyst should do--have considered the performance budget in using cursor in that particular code.

    --Jim

  • g.britton (6/1/2015)


    So, GOTOs have been "considered harmful" for almost 50 years!

    It's a little surprising to me that no one has riffed that title to write a "Cursors considered harmful" article yet.

    There is a big difference.

    Go to is generally considered harmful in 3GL programming because it tends to result in hard to understand "spaghetti" code. (That being said, there are exceptions to any rule, including this one).

    On the other hand, cursors are not specifically bad for maintainability - as we can see from some comments here, it's acquired taste. Once you are used to set-based code, you will not want to return to writing loops. But if that's what you've done all your life, then writing and maintaining cursor-based code will be much easier for you.

    However, the real argument against cursors is performance. Iterative code is so much slower than set-based in >99.9% of all cases. With one important caveat - this holds true for SQL Server, but not for other RDBMS's. I do not have first hand experience, but I heard from multiple sources that e.g. Oracle is optimized very differently than SQL Server; in Oracle there is apparently little to no performance difference between well written set based code or well written cursor-based code.

    So the correct title of the article would be "Cursors considered harmful for SQL Server".


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • I had a task that required iterating through a result set from a query. On a whim I tried it in a WHILE loop and then tried it by using a cursor against a temp table.

    The cursor was dramatically faster.

    Note that the temp table was dedicated to the process I was trying to run and therefore I had no concerns about concurrency.

    I used cursors so rarely that I always have to resort to BOL for the syntax. I'd say that the vast majority of the queries I write can be done without using loops of any kind.

  • David.Poole (6/2/2015)


    I had a task that required iterating through a result set from a query. On a whim I tried it in a WHILE loop and then tried it by using a cursor against a temp table.

    The cursor was dramatically faster.

    Can you post your code? My guess is that a cursor directly against the table (without temp table), and with the correct options, will be even faster than the cursor against a temp table. If you share your code, I can see if I can adapt it to avoid the intermediate step of the temp table.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • @james Stephens

    Got it! I agree with you. My apologies for the snide remarks, earlier. If an sp is run once a year and it loops through a handful of records, Row By Agonizing Row (RBAR), but only takes 5 seconds to run, it would be hard to justify spending a day to re-write it as set-based so it would run in half a second "just because you can". It wouldn't hurt to put a comment in it though, that said "Don't use this as template code!"

    Most of the zealots here are quick to agree that cursor (or other) loops can be good solutions in specific cases already mentioned in this thread. (Such as the third-party provided megalithic stored procedure that you have to call with one object at a time, or for "chunking" huge operations.) It's RBAR looping through large datasets that gets the angry villagers' torches lit.

    I won't ridicule anyone for saying cursors shouldn't be banned. In fact I do have an example where it seems cursors are faster than set-based but I'm fairly sure it's because my set-based approach has hidden RBAR. I'll post it after I give up and with any luck I'll get schooled and learn something.

Viewing 15 posts - 76 through 90 (of 215 total)

You must be logged in to reply to this topic. Login to reply