Cursor optimization

  • Hugo Kornelis (7/8/2008)


    GSquared (7/8/2008)


    Just to test this, I tried the following on a table with just over 1-million rows of data:

    (...)

    I had used fast_forward in the past, based on what BOL says, but I'm changing that now. Thanks.

    Hi GSquared,

    Great. That's what I wanted to achieve with this QotD.

    But do please read the rest of the thread as well. Someone pointed out to me that I only tested with tables that can be completely fit in cache. Once that changes, FAST_FORWARD seems to be the better choice (based on my, so far rather limited, tests).

    Yeah, a simple table of just integers isn't a really good test. Try adding a few TEXT (VARCHAR(MAX)) and/or IMAGE (VARBINARY(MAX)) columns along with a bunch of joins and see what happens. I'd also be curious to see the results with a fully loaded system (read: multiple simultaneous users) hitting the tempdb hard. You may lose that little bit of gain with I-O waits, etc. on the tempdb.


    [font="Arial Narrow"](PHB) I think we should build an SQL database. (Dilbert) What color do you want that database? (PHB) I think mauve has the most RAM.[/font]

  • Another aspect to solve your problem is...

    I warn you: you won't like this...

    DOCUMENTATION!!!

    I used to use a tool called Case Studio to document my database schemas.

    It still exists and does exactly the same but the price went from 50$ the second licence to about $500 since it was bought by Quest Software and renamed "Toad DataModeler".

    Having said that I always revert to my (documentation) sub schemas to remember tables and/or column names.

    For procedures, the text editor, associated with comments within the stored procs (and headers) does wonders. ๐Ÿ™‚

    I don't think any magic technology can replace documentation of one form or another...

  • Hugo Kornelis (7/8/2008)


    Christian Buettner (7/8/2008)


    In general an interesting question, but it is simplified a little too much. Cursors are such a complicated construct with so many nested options and dependencies on the requirements, so that general statements about optimal keywords can not be made in my opinion.

    Hi Christian,

    That is why I specifically asked what option is required for maximum performance. Of course, in real life you are usually confronted with lots of other requirements, making it a balancing act instead of a simple yes/no question. That's why real life isn't appropriate for something such as the QotD, and simplified situations are ๐Ÿ™‚

    As an example, the static keyword implies that you don't want to be able to see changes made after the cursor has been opened. fast_forward cursors are dynamic per default, which means that you can see changes made to the data during cursor operations.

    A FAST_FORWARD cursus is dynamic? Wow, I never knew that. Thanks!!!

    Hi Hugo,

    you are welcome. I did not know this by heart, I had to review BOL again.

    From my learning time towards certification, i remembered all those nasty aspects of cursors, and how easily they are misunderstood. Before my certification, I only knew the tip of the iceberg and still I am sure don't know every aspect of cursors.

    But I am of the opinion, that context is an important part of the QOTD. The danger with no context is that a general rule is derived from specific scenarios. Most QOTD readers probably are not too familiar with the topics and might get away with a wrong impression. (Actually this seems to be a big problem of the internet. There are too many "half" truths out there that cause so many misunderstandings).

    But hats off to you. I highly appreciate your decision to post this QOTD. Especially with this topic, I could not have done it any better. But I actually have a suggestion to make, although it is directed more to Steve:

    Would it be possible to create a second QOTD user group that is able to see QOTDs one day before the other group?

    This first group can then test drive the questions (but get points as usual) and Steve can filter or fix the questions if necessary before Group 2 comes into play.

    The users should be able to decide which group they want to be in:

    In group one they can showcase their skills by correcting wrong QOTDs a

    In group two they can focus on learning from the QOTDs

    Best Regards,

    Chris Bรผttner

  • Christian,

    Is it not "part of the fun" to sometimes get it "not entirely right"?

    I actually find the debate as interesting if not more interesting than the original question and/or answer.

    It would be sad to deny any particular group the benefit of these discussions.

    ...And it makes me feel good to see I am not the only one making slight mistakes from time to time ๐Ÿ˜€

    Eric

  • Having submitted 4 QOD and struggling to insure that they were clearly worded I can understand the authors problem and appreciate all his work but before condemning BOL may I quote from the blog entry

    Disclaimer: All results presented here are only valid for my test cases (as presented below) on my test data (a copy of the SalesOrderDetail table in the AdventureWorks sample database), on my machine (a desktop with 2GB of memory, a dual-core processor, running SQL Server 2005 SP2), and with my workload (just myself, and only the test scripts were active). If your situation is different, for instance if the table will not fit in cache, if the database is heavily accessed by competing processes, or if virtually any other variable changes, you really ought to perform your own test if you want to squeeze everything out of your cursor. And also consider that many options are included to achieve other goals than performance, so you may not be able to use all options without breaking something.

    Considering the above I would still say that BOL statement concerning Fast-Forward should have been an acceptable answer or the question should have had as a choice "It Depends"

    If everything seems to be going well, you have obviously overlooked something.

    Ron

    Please help us, help you -before posting a question please read[/url]
    Before posting a performance problem please read[/url]

  • But do please read the rest of the thread as well. Someone pointed out to me that I only tested with tables that can be completely fit in cache. Once that changes, FAST_FORWARD seems to be the better choice (based on my, so far rather limited, tests).

    Hugo, good question leading to a worthwhile debate. Now I have a way to speed up all my TSQL code! (RBAR squad - thats a joke.):)

    One question - on a production server surely you cannot know how much cache is available to a query and anyway it is a variable, so whether static or fast_forward is best could alter? Would a good rule of thumb be if the columns used in the cursor are small, use static, if large use fast_forward, or is that too simplistic?

    I guess only way if in doubt is to test under production conditions.

    ---------------------------------------------------------------------

  • Nice question.....:)

  • I expected to get this wrong. What a surprise when I saw that STATIC is correct. Then I noticed it was Hugo who posted the question. Hugo has a great article on this topic.

    http://sqlblog.com/blogs/hugo_kornelis/archive/2007/11/21/curious-cursor-optimization-options.aspx

    Jamie

  • What about the other BOL article that says: "Dynamic cursors open faster than static or keyset-driven cursors" . All other recommendations in this article are to use the default settings for performance

    http://msdn.microsoft.com/en-us/library/ms187859.aspx

    SQL Server 2005 Books Online (September 2007)

    About Choosing a Cursor Type

    Rules for Choosing a Cursor Type

    Some simple rules to follow in choosing a cursor type are:

    Use default result sets when possible. If scrolling is needed, it may still be more efficient to cache a small result set on the client and scroll through the cache instead of asking the server to implement a cursor.

    Use the default settings when fetching an entire result set to the client, such as when producing a report. Default result sets are the fastest way to transmit data to the client.

    Default result sets cannot be used if the application is using positioned updates.

    Default result sets must be used for any Transact-SQL statement or batch of Transact-SQL statements that will generate multiple result sets.

    Dynamic cursors open faster than static or keyset-driven cursors. Internal temporary work tables must be built when static and keyset-driven cursors are opened, but they are not required for dynamic cursors.

    Regards,Yelena Varsha

  • emamet (7/8/2008)


    Christian,

    Is it not "part of the fun" to sometimes get it "not entirely right"?

    I actually find the debate as interesting if not more interesting than the original question and/or answer.

    It would be sad to deny any particular group the benefit of these discussions.

    ...And it makes me feel good to see I am not the only one making slight mistakes from time to time ๐Ÿ˜€

    Eric

    Hi Eric,

    well, actually it is not really part of the fun. But it's no big deal for me with regards to the lost points. I usually compensate this with complaining in the thread ๐Ÿ˜€ And yes, the following discussion often reveals much more information than any QOTD could do. But still I think it is important to leave as little room as possible for misunderstandings. Therefore I made also the suggestion to have two groups for the QOTD.

    And just to repeat: I would have not done a better job and highly appreciate people that submit QOTDs. Humans in the end are humans and they do make mistakes. Some make less, I make more... ๐Ÿ˜‰ Therefore it is so important to have some kind of proof reading. When I do architecture reviews at my job, I can easily spot errors and design flaws in code that I did not write by myself. In contrast, when I do review my code, I just overlook them more often.

    Best Regards,

    Chris Bรผttner

  • davidthegray (7/8/2008)


    Hugo, yes sure, I'm well aware about the risks of SQL injection. I always validate the input coming from the public web site, and I'm used to use parameters in most of the cases.

    Hi David,

    If this were my database, this would bother me a lot. Using parameters "most of the cases" means exposing yourself to injection attempts "in some of the cases". And validating helps prevent the more common and well-known injection attempts, but there will always be inventive hackers who think up new ways to circumvene the validation. Like the recent wave of injection attacks where the malicious code is injected in a binary encoded form.

    Really, AFAIC there is no reason at all not to use parameters for ALL user-supplied input.

    I'm wondering since several years why MS does not offer the possibility to group db object in folders, as it does with the other development tools

    Part of me likes the idea, part of me doesn't. The latter part is the part that reminds me that SQL Server is a server product, and that neatly organized folders is a client function. You could of course add some grouping functionality to Management Studio, but since it'd have to be stored on the client, you'd lose it when accessing the DB from another computer. Not sure if customers would appreciate such an implementation.

    However, if you think Microsoft should implement this, why not tell them so? https://connect.microsoft.com/SQLServer.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • JohnG (7/8/2008)


    Try adding a few TEXT (VARCHAR(MAX)) and/or IMAGE (VARBINARY(MAX)) columns along with a bunch of joins and see what happens. I'd also be curious to see the results with a fully loaded system (read: multiple simultaneous users) hitting the tempdb hard. You may lose that little bit of gain with I-O waits, etc. on the tempdb.

    Hi John,

    Good points. Unfortunately, I currently lack the time to set up a test for the former, and I completely lack all prerequisites to test the latter. However, if you (or anyone else) are able to conduct such tests, then please do so and share the results with me!


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • davidthegray (7/8/2008)


    Thanks for the answers to my question. Looks like in my db I have 23, among sp's and triggers, using cursors, all written several years ago, which makes about 1/10 of the total. In the recent years I mostly use dynamic SQL strings sent to the db by the client application (mostly ASP.NET), which is also deprecated, but I find that managing the catalog of objects in order was becoming unconfortable as the number of sp's was increasing. Otherwhise today I'd probably have 500 stored procedures in my db.

    How I'd like if MS would put a tree structure in the db object names, allowing us to create folders to keep tables, views and stored procedures reasonably ordered instead of having to scroll between hundreds of names when searching for something...

    Am I the only one longing for this feature? Sorry for going a bit off-topic...

    First, I don't consider 500+ stored procs all that unwieldy. Maybe I'm not in the majority (probably not; if I were, it would be a first!), but my main database a couple of years ago had 2394 procs in it. All were in use when that database was live. (Also had over 400 tables.)

    Second, you can achieve much of what you're looking for with a good naming convention.

    But yes, it would be nice to have a folder-tree type hierarchy for organizing database objects. Would be very nice. I wouldn't consider it a high priority request, but I'd certainly like it. Wouldn't even have to reside in the database, if Management Studio were set up to store the structure as an XML path just for that purpose. (That would also make it so different DBAs/Devs could have different "folders" if they so desired, or could share the XML file if they wanted the same folders.)

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Hugo Kornelis (7/8/2008)


    GSquared (7/8/2008)


    Just to test this, I tried the following on a table with just over 1-million rows of data:

    (...)

    I had used fast_forward in the past, based on what BOL says, but I'm changing that now. Thanks.

    Hi GSquared,

    Great. That's what I wanted to achieve with this QotD.

    But do please read the rest of the thread as well. Someone pointed out to me that I only tested with tables that can be completely fit in cache. Once that changes, FAST_FORWARD seems to be the better choice (based on my, so far rather limited, tests).

    I'll keep that in mind if I ever run into a cursor that can't fit in the cache. However, I think the only cursors I use are ones that are simply a list of database or object names, and I don't even use those very often. If I ever run into a server that can't fit all the database names into RAM, I think I'll have bigger worries than a few seconds plus or minus on a cursor in a maintenance script. ๐Ÿ™‚

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Christian Buettner (7/8/2008)


    Before my certification, I only knew the tip of the iceberg and still I am sure don't know every aspect of cursors.

    Hi Christian,

    I'd say that this is actually a good thing. Anyone who knows, or even thinks he/she knows every aspect of cursors, is either employed by Microsoft and working on the SQL Server team, actively maintaining the cursor functionality - or is in a desperate need to learn to use set-based code instead of cursors.

    Good SQL Server developers will hardly ever write a cursor. And thus, will never have the opportunity to find out much about them.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

Viewing 15 posts - 31 through 45 (of 65 total)

You must be logged in to reply to this topic. Login to reply