Updating through cursors

  • Comments posted to this topic are about the item Updating through cursors


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • I used the following extract from BOL ms-help://MS.SQLCC.v9/MS.SQLSVR.v9.en/udb9/html/7b28942f-252b-4448-88bc-dedbe53b0a82.htm (Changing Rows with Positioned Operations - Performing Positioned Updates with APIs) to kinda-guess the right answer.

    "After positioning in the cursor, execute an UPDATE or DELETE statement with a WHERE CURRENT OF clause referencing the name returned by SQLGetCursorName. But this method is not recommended. Instead, it is better to use the positioned update functions in the ODBC API."

    Was it a fluke, or is it some indirect help from BOL? Or maybe direct help - cursors are a bit of a *gasp* grey area for me!

  • I find this question a little confusing. It should be clearly stated in the question that you do not need to see updates made to the underlying data. Because that is a requirement for the suggested answer.

    Best Regards,

    Chris Büttner

  • I agree that this question was a bit confusing. I would never have chosen the "right" answer just because this is not any way to update the cursor, but the underlying data without using the cursor...

    I always try to avoid using cursors and almost always succeed, but when I use cursors I always use FAST_FORWARD.

    -- Gianluca Sartori

  • brewmanz (7/24/2008)


    I used the following extract from BOL

    (...)

    Was it a fluke, or is it some indirect help from BOL? Or maybe direct help - cursors are a bit of a *gasp* grey area for me!

    Hi Brewmanz,

    That bit in Books Online is about client-side cursors, not about T-SQL cursors. When I submitted the question, it didn't occur to me to think about possible confusion with client-side cursors. That's the problem with beiing a "pure" T-SQL coder - one often tends to forget that the data is eventuallly served to and consumed by a client!


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Christian Buettner (7/24/2008)


    I find this question a little confusing. It should be clearly stated in the question that you do not need to see updates made to the underlying data. Because that is a requirement for the suggested answer.

    Hi Christian,

    I will readily admit that it had never occured to me that anyone would want to use a cursor to update data and then go back over the same data and read it again! (I already have problems realizing that there are people that use cursors on a regular basis at all!!)

    However, after reading your reply I decided to go back and test it for myself, for I have never read anything to support this requirement. Here is the test code I used:

    create table test (a int primary key, b int)

    insert into test (a, b) select 1, 1 union all select 2, 2

    go

    declare @a int, @b-2 int;

    declare c cursor scroll

    for select a, b from test order by a

    for update of b;-- Comment this line to test alternative

    open c;

    fetch first from c into @a, @b-2;

    select @a, @b-2;

    update test set b = 10

    where current of c;-- Comment this line to test alternative

    --where a = @a;-- Uncomment this line to test alternative

    fetch next from c into @a, @b-2;

    select @a, @b-2;

    fetch first from c into @a, @b-2;

    select @a, @b-2;

    close c;

    deallocate c;

    go

    drop table test;

    go

    As you'll see if you run this code, you get the changed value back both when using FOR UPDATE and WHERE CURRENT Of, and when not using FOR UPDATE and using WHERE a = @a. All variations of cursor options I tried either showed the same result, or resulted in error messages because of illegal option combinations.

    I'd apppreciate it if you can post any code that does show that changes are only visible when using FOR UPDATE.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Gianluca Sartori (7/24/2008)


    I agree that this question was a bit confusing. I would never have chosen the "right" answer just because this is not any way to update the cursor, but the underlying data without using the cursor...

    Hi Gianluca,

    I agree that the words "need to change the data in the cursor rows" (emphasis added) can be a little confusing. With hindsight, I should have formulated this as "need to change the data retrieved by the cursor" or something similar.

    But the three answer options given all do the same thing: update data in the underlying table. So even if there does exist some way to update just the cursor without updating the underlying data (which I doubt, given that a cursor is in fact a positioning mechanism and not a data colection), it should have been clear from the answer options that the intention is to change the data in the underlying table.

    I always try to avoid using cursors and almost always succeed, but when I use cursors I always use FAST_FORWARD.

    You might wish to check out the question for July 8, which deals with FAST_FORWARD and other options for fast (readonly) cursors.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Hi Hugo,

    This is a misunderstanding. What I meant to say was that when you open a static cursor, you have a point in time copy of the data as it was when you opened the cursor.

    You will not see any updates made to the underlying data after that, since you are working with a copy of the data, not the stored data itself.

    Best Regards,

    Chris Büttner

  • Hugo Kornelis (7/24/2008)


    Gianluca Sartori (7/24/2008)


    I agree that this question was a bit confusing. I would never have chosen the "right" answer just because this is not any way to update the cursor, but the underlying data without using the cursor...

    Hi Gianluca,

    I agree that the words "need to change the data in the cursor rows" (emphasis added) can be a little confusing. With hindsight, I should have formulated this as "need to change the data retrieved by the cursor" or something similar.

    But the three answer options given all do the same thing: update data in the underlying table. So even if there does exist some way to update just the cursor without updating the underlying data (which I doubt, given that a cursor is in fact a positioning mechanism and not a data colection), it should have been clear from the answer options that the intention is to change the data in the underlying table.

    Hi Hugo, thanks for your reply!

    I don't know how SQLServer handles internally the updating of the cursor using CURRENT OF, but I think it simply identifies the row to update in the underlying table from the position in cursor, without issuing additional statements.

    I found myself working with an old VB6 application some years ago, using DAO to access data. DAO handles server-side cursors better than client-side cursors and I used to check for long running queries with profiler: updates to a recordset resulted in some sort of sp_cursor in profiler.

    I don't know exactly what sqlserver does when you update with CURRENT OF, but I would expect something similar to this.

    I always try to avoid using cursors and almost always succeed, but when I use cursors I always use FAST_FORWARD.

    You might wish to check out the question for July 8, which deals with FAST_FORWARD and other options for fast (readonly) cursors.

    I'm one of those who would have answered "it depends", if only the option was there.

    -- Gianluca Sartori

  • Christian Buettner (7/24/2008)


    This is a misunderstanding. What I meant to say was that when you open a static cursor, you have a point in time copy of the data as it was when you opened the cursor.

    You will not see any updates made to the underlying data after that, since you are working with a copy of the data, not the stored data itself.

    Hi Christian,

    That's true. The options STATIC and FOR UPDATE are mutually exclusive, though.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • or because you run into one of the very rare problems where a cursor actually runs faster than a set-based solution

    Hi Hugo,

    Do you have an example of when that might occur? I ask because so many people write hidden RBAR into solutions such as running total problems and call them "set based" just because they don't contain an explicit Cursor or While loop.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (7/24/2008)


    or because you run into one of the very rare problems where a cursor actually runs faster than a set-based solution

    Hi Hugo,

    Do you have an example of when that might occur? I ask because so many people write hidden RBAR into solutions such as running total problems and call them "set based" just because they don't contain an explicit Cursor or While loop.

    Hi Jeff,

    I've got a few, actually. 🙂

    1: There is already a tried and tested stored procedure that processes a single row at a time, and the logic in it is quite complex. The estimated cost to rewrite the logic into a set based query is too high to justify it, especially since the total time taken is not problematic (for instance, a ten-minute batch running in a six-hoour maintenance window).

    2: Common problems such as running aggregates or string concatenation - typically tasks better suited for offloading to the client, but that's not always feasible. There are some well-known solutions to these problem, but they have their own problems. For starters, most are undocumented and not guaranteed to always work, which should be a major showstopper for any serious application. Besides, these also have a problem with multiple sets (e.g. running totals for the output, but reset when a new branch starts).

    The one solution I do know that is documented and guaranteed is the FOR XML trick for string concatenation - and that one has its own issues when you can't guarantee the absence of special XML characters (lesser than, greater than, and ampersand) in the data.

    3: Some problems known to be "NP-complete" - basically, this means that the "best" solution can only be found by trying every possible permutation. Doing this iterative means you can eliminate many possiblilities by deciding to cut a branch short if it's already clear that it won't ever become good (like a chess program might do: if a move causes it to lose its queen, there's really not much point in calculating all the possibilities for the moves after that). In a set-based solution, that is not possible as you have to rely on the optimizer "seeing" this as a way to speed up execution - and though that might happen in simple situations, the real situations are often too complex for the optimizer to see this.

    Though these are all examples where cursor-based is better than set-based, they are not all examples where cursor-based is best. Some (not all!) of the problems in categories 2 and 3 are better solved using a technique that I call "set-based iteration" and that blends set based and iterative elements. I 'll have a session about this techique at this year's PASS Summit.

    If you check my blog, you'll see that I started a series on "bin packing" (an example of an NP-complete problem). Lack of time has caused some severe delays in finishing and publishing parts four (that will show that a set-based approach really only can serve to test the newest and hottest 32-way quad-code processors with 500 zettabytes of RAM but not as a viable real-life solution) and five (where I will present an alternative solution using set-based iteration and show how it blows both the cursorbased and the setbased solutions). This example will probably also be included in my PASS presentation. :hehe:


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Thanks for taking the time for that feedback, Hugo. I appreciate it.

    Running totals are no excuse for using cursors... not even for display.

    http://www.sqlservercentral.com/articles/Advanced+Querying/61716/

    So far as the "bin packing" thing goes, I too am remiss... I said I think I know a set based way to do it and I just haven't gotten back to it. My appologies to you and myself because it's definitely an interesting problem and sounds like a lot of fun.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (7/24/2008)


    Thanks for taking the time for that feedback, Hugo. I appreciate it.

    Running totals are no excuse for using cursors... not even for display.

    http://www.sqlservercentral.com/articles/Advanced+Querying/61716/

    Hi Jeff,

    Just as I appreciate your (and all other!) comments here!

    I have read that article, and some of the comments (but with over 20 pages of them, I hope you forgive me for not reaching the end ;)). I won't be adding to that, but I will explain here why I would not use that method in production code:

    The index hint will tell SQL Server to use that index, but it does not tell it how to use it. I see three major risks:

    1) SQL Server might chose to do an unordered index scan, in which pages are read in allocation order instead of in the logical order dictated by the next page/prev page pointers.

    2) There is also the possibility of a merry-go-round scan (http://msdn.microsoft.com/en-us/library/ms191475.aspx). The TABLOCKX hint will effectively prevent most of this problem - with the exception of table scans that use NOLOCK or the READ UNCOMMITTED transaction isolation level.

    3) If the plan for the update includes parallelism, you'll definitely get incorrect results as well. I understand from the comments to your article that I did read that you tried to repro this yourself. This is a bit unfortunate, for (as far as I know) no current version of SQL Server will ever use parallellism for data modifications; this is so far limited to selecting only. However, with the increase of multi-socket and multi-core servers, I think it's only a matter of time until Microsoft does add the possibility of parallel plans for inserts, updates, and deletes. At that point, your code will break.

    I have tried multiple times to break your code using one of the first two risks. I failed. I was unable to force an unordered scan for the update; I never figured out if that's because of something in my desktop, or because SQL Server understands that this update needs an ordered scan. I was also unable to repro a merry go round scan on my desktop - both for this update and for two near-simultaneous select statements, so that is probably due to something in my desktop.

    So, I have to applaud you for presenting a very efficient and, based on tests so far, apparently very safe solution. It IS indeeed a very clever and creative solution. BUT ... in production code, "apparently very safe" is not enough. One day, Microsoft might tweak their optimizer a bit, and if that affects the way this update is processed, you'd be SOL.

    If an important report for upper management shows incorrect figures, I desperately want to be able to point to Microsoft and say it's *their* fault for building a database that doesn't work as documented. Since the behaviour you're using here is not documented, I'd be unable to do that if stuff starts hitting the fan, and it'll be *my* head on the chop-block. :w00t:

    Do you rmember how everyone started creating views with TOP 100 PERCENT and ORDER BY in SQL Server 2000, so that they needed no ORDER BY when selecting from the view? And do you also remember what happened when SQL Server 2005 had an improved optimizer that understood that TOP 100 PERCENT when ordered by something is just everything, so that no ordering was required?


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Good one Hugo.........

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply