A Case FOR Cursors...

  • You are correct. If you scroll up to Jeff's reply to Hugo you'll see he mentions that usage of cursors.

    An alternative to granting select on the dbo schema could also be:

    set @SQL = 'ALTER ROLE db_datareader ADD MEMBER ' + @RoleName

    execute sp_executesql @SQL

    This would include tables/views in all user schemas including dbo. (Tables/views that are not in the dbo schema would break the sp_executesql part of your script.)

    But, depending on what you're doing, you might not want to grant blanket read access to other schemas.

    One nice thing about grants to a schema or adding members to built-in roles is that you don't have to explicitly manage permissions every time you add a new object.

  • [deleted]

  • patrickmcginnis59 10839 (7/6/2015)


    Again I am not sure why Oracle can make their PL/SQL using cursors as fast as set based operations and MS cannot. Cursors are not evil, and not bad, just a different method of processing data.

    There are cases where you want set oriented (or as a better generalization in my opinion, "declarative code"), and there are clearly reasons why T-SQL is slow, I have a nice stack of theories and opinions on that. But diving into conversations like this get TREMENDOUSLY UNPOPULAR here in my experience, but it makes sense though, the folks here aren't database server implementers, they are database server users, and for that matter, a great majority of folks have no interest in generalizing their knowledge past the MS product base. Such is the nature of the forum.

    But yeah, T-SQL sucks giant gobs of performance out of looped code, and in my opinion, even Microsoft understands this, hence their new compiled implementations in 2014. Sometimes you just gotta have a faster stack, and Microsoft is not going to miss out on that if they can help it.

    With that I have to ask, what is YOUR definition of a "database Implementer"?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • [deleted]

  • patrickmcginnis59 10839 (7/6/2015)


    Again I am not sure why Oracle can make their PL/SQL using cursors as fast as set based operations and MS cannot. Cursors are not evil, and not bad, just a different method of processing data.

    There are cases where you want set oriented (or as a better generalization in my opinion, "declarative code"), and there are clearly reasons why T-SQL is slow, I have a nice stack of theories and opinions on that. But diving into conversations like this get TREMENDOUSLY UNPOPULAR here in my experience, but it makes sense though, the folks here aren't database server implementers, they are database server users, and for that matter, a great majority of folks have no interest in generalizing their knowledge past the MS product base. Such is the nature of the forum.

    But yeah, T-SQL sucks giant gobs of performance out of looped code, and in my opinion, even Microsoft understands this, hence their new compiled implementations in 2014. Sometimes you just gotta have a faster stack, and Microsoft is not going to miss out on that if they can help it.

    I'm not really sure what the point of this post is, but I feel vaguely insulted. Are you implying there's no value in learning set-based approaches because Microsoft will soon have lightning fast cursors (instead of sucking "giant gobs of performance") so they can compete with Oracle's cursor speed?

    "...the folks here aren't database server implementers, they are database server users..."

    Hmmm, I think the reason most of us are here is clear from the title on this site's home page:

    Welcome to SQLServerCentral.com

    A Microsoft SQL Server community of 1,813,218 DBAs, developers and SQL Server users

    and I'll go so far as to agree it's a fair assumption that most folks here aren't looking for RDBMS design content. But it's your conclusion that none of us are "database server implementers". Seriously though, how many people on this planet can truly claim that title? (Do you?) I'm going to guess considerably less than the 1,813,218 folks in this community.

    "...and for that matter, a great majority of folks have no interest in generalizing their knowledge past the MS product base."

    Now that's a bold assumption. Do you know what any of our interests are? (Also, doesn't your "nice stack of theories and opinions" on "reasons why T-SQL is slow" fall within the MS product base category?) If I do have other areas of interest I'll find sites where those interest areas are on topic.

    Also, I think the source of your anonymous quote edited their original post to be more accurate. I doubt you'll find any experienced Oracle DBA who claims "Oracle can make their PL/SQL using cursors as fast as set based operations" at least not for record counts of any significance.

    Finally, I think the performance gap between Microsoft's cursor and set based code is a good thing because it encourages SQL developers to use set based code which, I understand, is how relational databases are designed to be used. But what do I know? I'm not an RDBMS engineer.

  • [deleted]

  • patrickmcginnis59 10839 (7/6/2015)


    Jeff Moden (7/6/2015)


    patrickmcginnis59 10839 (7/6/2015)


    Again I am not sure why Oracle can make their PL/SQL using cursors as fast as set based operations and MS cannot. Cursors are not evil, and not bad, just a different method of processing data.

    There are cases where you want set oriented (or as a better generalization in my opinion, "declarative code"), and there are clearly reasons why T-SQL is slow, I have a nice stack of theories and opinions on that. But diving into conversations like this get TREMENDOUSLY UNPOPULAR here in my experience, but it makes sense though, the folks here aren't database server implementers, they are database server users, and for that matter, a great majority of folks have no interest in generalizing their knowledge past the MS product base. Such is the nature of the forum.

    But yeah, T-SQL sucks giant gobs of performance out of looped code, and in my opinion, even Microsoft understands this, hence their new compiled implementations in 2014. Sometimes you just gotta have a faster stack, and Microsoft is not going to miss out on that if they can help it.

    With that I have to ask, what is YOUR definition of a "database Implementer"?

    Database ->server<- implementers. They're the ones making the decisions on how fast or slow T-SQL is, at least in Microsoft's case! To me, its a very interesting subject even if I don't have the talent to do coding at that level, but like you said, whether or not T-SQL or Microsoft brand cursors can be faster or not is not what you're asking, right?

    Specifically, the following seemed to be a whole lot condescending on your part...

    the folks here aren't database server implementers, they are database server users, and for that matter, a great majority of folks have no interest in generalizing their knowledge past the MS product base. Such is the nature of the forum.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (7/8/2015)


    Specifically, the following seemed to be a whole lot condescending on your part...

    the folks here aren't database server implementers, they are database server users, and for that matter, a great majority of folks have no interest in generalizing their knowledge past the MS product base. Such is the nature of the forum.

    Fair enough, my mistake, I'll keep out of your threads!

  • patrickmcginnis59 10839 (7/8/2015)


    Jeff Moden (7/8/2015)


    Specifically, the following seemed to be a whole lot condescending on your part...

    the folks here aren't database server implementers, they are database server users, and for that matter, a great majority of folks have no interest in generalizing their knowledge past the MS product base. Such is the nature of the forum.

    Fair enough, my mistake, I'll keep out of your threads!

    Heh... there you go again... jumping to conclusions. 😀

    You don't have to keep out of my threads. In fact, please don't keep out. You've got some good stuff and I'm interested in what you have to say. I just took exception to you effectively calling me (and everyone else on these forums) a "user". 😉 I don't believe that you or anyone else on these forums is qualified to make that judgment.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Without using specific examples to test, it is hard to explain why set-based processing outperforms cursor-based code. Logging certainly plays a part as does the frequency of posting updated pages in memory to disk. But is the cursor being used in a read-only manner or being used to drive inserts and updates? Is the amount of data so small that any joins would be nested loop joins? Is it holding data in memory regardless of the transactions being posted to the underlying table? "It depends" to coin a cliche.

    I know that one of our developers once used a cursor to read through a table once and do inserts to three separate tables, resulting in death by a thousand cuts as each insert had to be done individually. He mistakenly assumed this would be less work on the system than just reading the file three times and doing bulk inserts to each of the tables.

    As for performance of the relative databases:

    http://arxiv.org/ftp/arxiv/papers/1205/1205.2889.pdf

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • The Dixie Flatline (11/16/2015)


    Without using specific examples to test, it is hard to explain why set-based processing outperforms cursor-based code. ..."It depends" to coin a cliche.

    This may be one of those cases where is does not depend. Cursors are never the best solution.

    There are tons and tons of tests on SQLServerCentral.com and elsewhere showing the performance of cursors vs set-based methods for solving a number of problems. I have never seen a case where there was not a better performing solution to a cursor even when correctly written. Here's a free PDF sampler of Itzik Ben-Gan's Microsoft SQL Server 2012 High-Performance T-SQL Using Window Functions. Take a look at page 5 beginning with Declarative Language and Optimization. That will give you a good overview with the problem with cursors and why set-based is the way to go. Not included in the sampler (way later in the book) are a number of solutions to common problems such as Running Totals, Max Concurrent Intervals, Gaps and Islands, Packing Intervals. In each case he tests different methods including cursors and the cursor is never the fastest solution.

    And even if there was a case where a cursor was 10-20% faster than every set-based alternative I would not go that route because cursor code is a drag. At least a loop is easy to understand. Cursors code is for the birds.

    As for performance of the relative databases:

    http://arxiv.org/ftp/arxiv/papers/1205/1205.2889.pdf

    FYI... There is no mention of cursors in this document. 😉

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Alan.B (11/16/2015)


    The Dixie Flatline (11/16/2015)


    Without using specific examples to test, it is hard to explain why set-based processing outperforms cursor-based code. ..."It depends" to coin a cliche.

    This may be one of those cases where is does not depend. Cursors are never the best solution.

    There are tons and tons of tests on SQLServerCentral.com and elsewhere showing the performance of cursors vs set-based methods for solving a number of problems. I have never seen a case where there was not a better performing solution to a cursor even when correctly written. Here's a free PDF sampler of Itzik Ben-Gan's Microsoft SQL Server 2012 High-Performance T-SQL Using Window Functions. Take a look at page 5 beginning with Declarative Language and Optimization. That will give you a good overview with the problem with cursors and why set-based is the way to go. Not included in the sampler (way later in the book) are a number of solutions to common problems such as Running Totals, Max Concurrent Intervals, Gaps and Islands, Packing Intervals. In each case he tests different methods including cursors and the cursor is never the fastest solution.

    And even if there was a case where a cursor was 10-20% faster than every set-based alternative I would not go that route because cursor code is a drag. At least a loop is easy to understand. Cursors code is for the birds.

    As for performance of the relative databases:

    http://arxiv.org/ftp/arxiv/papers/1205/1205.2889.pdf

    FYI... There is no mention of cursors in this document. 😉

    What about sending out customized emails? Or any number of administrative tasks? There are absolutely cases where a cursor is the best option. And it has been demonstrated many times that a well written cursor will outperform a while loop. The problem is when a cursor is used for DML operations.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Sean Lange (11/17/2015)


    Alan.B (11/16/2015)


    The Dixie Flatline (11/16/2015)


    Without using specific examples to test, it is hard to explain why set-based processing outperforms cursor-based code. ..."It depends" to coin a cliche.

    This may be one of those cases where is does not depend. Cursors are never the best solution.

    There are tons and tons of tests on SQLServerCentral.com and elsewhere showing the performance of cursors vs set-based methods for solving a number of problems. I have never seen a case where there was not a better performing solution to a cursor even when correctly written. Here's a free PDF sampler of Itzik Ben-Gan's Microsoft SQL Server 2012 High-Performance T-SQL Using Window Functions. Take a look at page 5 beginning with Declarative Language and Optimization. That will give you a good overview with the problem with cursors and why set-based is the way to go. Not included in the sampler (way later in the book) are a number of solutions to common problems such as Running Totals, Max Concurrent Intervals, Gaps and Islands, Packing Intervals. In each case he tests different methods including cursors and the cursor is never the fastest solution.

    And even if there was a case where a cursor was 10-20% faster than every set-based alternative I would not go that route because cursor code is a drag. At least a loop is easy to understand. Cursors code is for the birds.

    As for performance of the relative databases:

    http://arxiv.org/ftp/arxiv/papers/1205/1205.2889.pdf

    FYI... There is no mention of cursors in this document. 😉

    What about sending out customized emails? Or any number of administrative tasks? There are absolutely cases where a cursor is the best option. And it has been demonstrated many times that a well written cursor will outperform a while loop. The problem is when a cursor is used for DML operations.

    I should have been clearer: my comment was in response to:

    Without using specific examples to test, it is hard to explain why set-based processing outperforms cursor-based code. ..."It depends"...

    So the context was data retrieval. But yes, when cursors are used for what they were intended to do I agree that they are fine.

    For most administrative tasks I still personally prefer a loop because they're just easier to write and understand. If my only choice was cursor or a loop for something like a running total cursors, specifically firehouse cursors, are the was to go (if you had to choose between the two.)

    A problem with cursors is when you do a Google search for "sql cursor" 90% of the pages, including msdn, have examples of cursors for basic DML operations.

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Alan.B (11/16/2015)


    This may be one of those cases where is does not depend. Cursors are never the best solution.

    Even in pure data manipulation cases, this is not always the case.

    On SQL Server 2008R2 and before, running totals are probably the best example. I have never found or seen a method for computing running (pre SQL Server 2012) that has comparable performance to a well-programmed cursor solution, except for the methods that abuse undocumented and hence unreliable behavior - not safe for production use in my opinion. (There is one specific version that has a smart fail-safe switch that will make it error instead of producing incorrect results when the undocumented trick back-fires; that would be the only method I would ever seriously consider, but only if I really had no other choice, and I would try to get someone else on pager duty).

    After SQL Server 2012 a lot of those common situations did get fast set-based solutions, thanks to the windowing extensions for the OVER clause. But there are still exceptions. One of those is the bin packing problem. I wrote about this a few years ago. You can find my articles at http://sqlblog.com/blogs/hugo_kornelis/archive/tags/Bin+Packing/default.aspx. The short version is that there is a very fast and quite efficient algorithm that uses a comination of iteration (cursors) and set-based work, several not so fast solutions with different efficiencies that are fully cursor-based, and a fully set-based algorithm that scales so bad that it would probably riun for years on even a below-average size data set.

    I wrote this before SQL Server 2012, and I must admit that I never looked in earnest at how the new language features could help. But I personally think that the situation is unchanged all the way up to SQL Server 2016 (except that, of course, using memory-optimized tables and natively compiled procedures would speed up all methods). However, if you think that there is a correct and scaling fully set-based solution, then I am all ears!

    I do agree with the message I think you intended to write, that cursors are almost never the best solution, and that they should only be used if one has truly exhausted all other options. But there are, unfortunately, exceptions.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • My experience is with Sybase, not SQL server, but there are two areas I have always found cursors (or loops if you wish, doesn't much matter) very helpful and far better than a direct set update of data. There may be ways around this but if so I'd be interested.

    1. When dealing with small number of rows or even a lot of data but within your resource handling capabilities set updates are great. However, when you're updating millions of rows that push your resource limits I have time after time found that a direct update (which is a single transaction for all these rows) can by practical experience grind the system almost to a halt and is completely impractical. Just put it in a cursor and it works great. And much faster.

    2. This may be related but when a complex application uses a lot of triggers (which to me is very normal), you are putting a million rows at once into the inserted and deleted tables, which often have no or very poor indexes. To go through and handle that much data with multiple actions being taken on each row, and different actions being taken on different rows with different updates and checks needed is impractical. It will take forever. Use a cursor and it's a piece of cake. Then the triggers only have to handle a single row at a time. Or you can even code it to grab a group of rows as long as its manageable. The point being by limiting the total rows you're handling at once on a system with a lot of data, it can be much more workable than trying to do actions on all that data at once.

    In both the above cases, it allows me to do updates during the day instead of only batched overnight when no one cares if the entire system is locked up.

    Outside of those (off the top of my head), set updates are definitely faster.

    One other point that seems to be in disagreement with some other comments is that I find Sybase cursors much more understandable in a self-documenting way than complex series of set updates to handle a bunch of different cases. It's very linear so very easy to document and follow the train of actions.

    Of course SQL Server may have other handlings for these or different issues--but for what it's worth the above I've run into enough times I actually don't know what to say. But I'm highly interested in big differences in performance when dealing with large sets of data, so if I'm missing something I'm interested.

Viewing 15 posts - 151 through 165 (of 215 total)

You must be logged in to reply to this topic. Login to reply