Eliminating Cursors

  • ksh (2/18/2010)


    what is being argued about? The tools were created to fulfill certain needs and are there to be used. Some tools are be better suited to certain situations than others.

    Those who state absolutes about technique ("I fire anyone who uses a cursor","I despise anyone who uses Dynamic SQL","Temp tables are a product of the dark side","I never talk to anyone who capitalizes the letter b on Thursday's") are usually just displaying their lack of understanding of the use of the technique or tool.

    FYI IMHO dynamic SQL rocks!!!!!! Have been using it for decades......

    Not denying that dynamic sql works and has it's uses - I have agreed to that now and my original statement I have taken back as it was not correct. The question right now (from me at least) is the security of it. I am maintaining that if not used in the right way it poses a large security risk. I am also trying to understand how you can use dynamic sql without potentially opening more data to the user than one may want to.

    One of the standards out there in application development is to use stored procedures for allowing the users to insert, update, delete or select data from a table. Thus you grant the user permissions to the stored procedure and not the table directly. This has many security benefits which can help stop things such as sql injection and data mining to just name a couple. However if I use dynamic sql in that same stored procedure I now have to give the user access to the same tables I was not wanting to give them direct access to in the first place.

    Below would be an example of probably the worst possible case of dynamic sql one could ever create and it should be obvious very quick what the security implications are. No one in the right mind should ever do this, but it gives a very distinct example of how dynamic sql can be bad if not used correctly:

    CREATE PROCEDURE up_dynamic @SQL varchar(1000) AS

    exec(@SQL)

    exec up_dynamic 'select * from master.dbo.sysprocesses;select * from sysobjects'

  • Here's what I think most people are missing on this matter:

    1. Dynamic SQL is not the problem, Injection IS.

    2. SQL Injection uses Dynamic SQL, but the two are far from synonymous.

    3. Dynamic SQL is just a facility used for Injection, Client code is another.

    4. It IS possible to write most needed dynamic SQL without Injection, and usually this is not hard.

    5. Dynamic SQL is as safe as anything else in SQL, *IF*, you never Inject user-supplied text into the SQL command.

    6. The principal means of doing 4 & 5 is to validate user-supplied text by replacing it with server-supplied text using some concrete table or function to generate valid-only text (in the context for which it is to be used).

    7. The permissions problem with dynamic SQL that you alluded to earlier is only a problem for owner-chaining permissions. The stronger and safer privilege elevation techniques of either Impersonation or Certificates do not suffer from this problem and are preferred for Dynamic SQL in any event.

    Thank you for that information.

  • Dynamic SQL allows for the creation of command(s) from a complex set of parameters/variables/conditions. When used properly it enables the centralization, extensibility and re-use of sp code.

    The SQL Inject "issue" is being addressed by others. FYI MS has some very good recent articles on the topic.

  • Kevin Rathgeber (2/18/2010)


    ...

    One of the standards out there in application development is to use stored procedures for allowing the users to insert, update, delete or select data from a table. Thus you grant the user permissions to the stored procedure and not the table directly. This has many security benefits which can help stop things such as sql injection and data mining to just name a couple. However if I use dynamic sql in that same stored procedure I now have to give the user access to the same tables I was not wanting to give them direct access to in the first place.

    Below would be an example of probably the worst possible case of dynamic sql one could ever create and it should be obvious very quick what the security implications are. No one in the right mind should ever do this, but it gives a very distinct example of how dynamic sql can be bad if not used correctly:

    CREATE PROCEDURE up_dynamic @SQL varchar(1000) AS

    exec(@SQL)

    I pretty much agree with everything that you are saying here. By way of demonstration, here is how you would write the above procedure to use Impersonation to solve the owner-chaining problem:

    CREATE PROCEDURE up_dynamic @SQL varchar(1000) AS

    exec(@SQL) AS 'NoPrivUser'

    Leaving aside the hideous case of Injection here, that "AS 'NoPrivUser'" will cause the EXEC(..) string to be executed as that user. Of course the owner of the 'up_dynamic' proc has to have sufficient rights to do this too.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • ksh (2/18/2010)


    Dynamic SQL allows for the creation of command(s) from a complex set of parameters/variables/conditions. When used properly it enables the centralization, extensibility and re-use of sp code.

    The SQL Inject "issue" is being addressed by others. FYI MS has some very good recent articles on the topic.

    What are the links for those articles?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • RBarryYoung (2/18/2010)


    1. Dynamic SQL is not the problem, Injection IS.

    2. SQL Injection uses Dynamic SQL, but the two are far from synonymous.

    3. Dynamic SQL is just a facility used for Injection, Client code is another.

    4. It IS possible to write most needed dynamic SQL without Injection, and usually this is not hard.

    5. Dynamic SQL is as safe as anything else in SQL, *IF*, you never Inject user-supplied text into the SQL command.

    6. The principal means of doing 4 & 5 is to validate user-supplied text by replacing it with server-supplied text using some concrete table or function to generate valid-only text (in the context for which it is to be used).

    7. The permissions problem with dynamic SQL that you alluded to earlier is only a problem for owner-chaining permissions. The stronger and safer privilege elevation techniques of either Impersonation or Certificates do not suffer from this problem and are preferred for Dynamic SQL in any event.

    Heh... I especially like observation #1 above. Might be time for you to write a new article, Barry. I've had people argue that any form of dynamic SQL is wrong because of potential security risks... I'd love to point them to a really good article on the subject with the simple instructions of "Read this and see why you're wrong."

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Heh. I did do a presentation on it. And I posted the presentation kit, including SQL code examples and Client code at my blog, here:http://movingsql.com/dnn/LinkClick.aspx?fileticket=pa1HXFdJ5Bs%3d&tabid=125&mid=911 But, I never did get around to writing the article I intended. 🙁

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Here is one of several of my Blog posts that talks about this: http://www.movingsql.com/dnn/Default.aspx?tabid=125&EntryID=190. In fact if you go my Blog site and click on the "Dynamic SQL" topic you'll find almost a dozen articles on this subject.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Very cool set of links, Barry. Great blog site on your part, as well! Other than the awesome content (haven't explored it all, yet, of course) is that it's easy on the eyeballs. There's just something to be said about a traditional white background with dark letters instead of the other way around.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (2/21/2010)


    Very cool set of links, Barry. Great blog site on your part, as well! Other than the awesome content (haven't explored it all, yet, of course) is that it's easy on the eyeballs. There's just something to be said about a traditional white background with dark letters instead of the other way around.

    Heh, yeah, Grant's wife really punctured my bloated ego about my previous site style, so I changed it to something a little more restrained. 😛

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Mark Hickin (1/17/2006)


    I think it's worth making the point that this approach does nothing to combat what is fundamentally bad about cursors. All we have done here is createda cursor, without using the declare cursor syntax.

    The article states:

    <However, this kills the whole theory of set-based operations, which are inherently faster than their row based counterparts. The following is one way of converting your CURSOR statement using the WHILE keyword in such a scenario>

    Yes, but this still isn't set based is it? It's still doing everything row by row, which is exactly what we want to avoid.

    Much better to take a step back and see if you process many, or all of the purchase orders in a small number of set based sql statements, and try not to take such a procedural approach.

    I'd also say that the 75% performance improvement is highly dubious. How can this be? Presumably, the work is being done within the usp_generate_purchase_order procedure. Are we really saying that the looping code was the bottleneck? If this process took 1 min originally, are we saying that by substituting the cursor for the while loop reduced it to 15 secs? I really don't think so.

    I'm not saying that this construct should never be used. There are many cases when you might want to slow things down (archiving/purging/data movingfor example) to aid concurrency and reduce the size of transactions, but this is not a good pattern to use in high performance database programming.

    I would guess that the difference was their cursor inspected more rows.

    Putting only the rows that needed operation into the in memory table and then running through the loop is the best option for any loop.

    I would wonder why not use CTE ( as much as I dislike them).

  • PHYData DBA (2/26/2015)


    Mark Hickin (1/17/2006)


    I think it's worth making the point that this approach does nothing to combat what is fundamentally bad about cursors. All we have done here is createda cursor, without using the declare cursor syntax.

    The article states:

    <However, this kills the whole theory of set-based operations, which are inherently faster than their row based counterparts. The following is one way of converting your CURSOR statement using the WHILE keyword in such a scenario>

    Yes, but this still isn't set based is it? It's still doing everything row by row, which is exactly what we want to avoid.

    Much better to take a step back and see if you process many, or all of the purchase orders in a small number of set based sql statements, and try not to take such a procedural approach.

    I'd also say that the 75% performance improvement is highly dubious. How can this be? Presumably, the work is being done within the usp_generate_purchase_order procedure. Are we really saying that the looping code was the bottleneck? If this process took 1 min originally, are we saying that by substituting the cursor for the while loop reduced it to 15 secs? I really don't think so.

    I'm not saying that this construct should never be used. There are many cases when you might want to slow things down (archiving/purging/data movingfor example) to aid concurrency and reduce the size of transactions, but this is not a good pattern to use in high performance database programming.

    I would guess that the difference was their cursor inspected more rows.

    Putting only the rows that needed operation into the in memory table and then running through the loop is the best option for any loop.

    I would wonder why not use CTE ( as much as I dislike them).

    Before anyone jumps into this with a reminder, yeah... we realize this is a 9 year old post. 😉

    Just pitching in my 2 cents here...

    As always, "It Depends". I've found that for control loops, it's usually not the loop that causes the problem. It the fact that people forget about things like table scans on the control table. We have a huge problem at work because people either do the SELECT TOP 1/DELETE WHERE in each iteration or they do the counter thing with a lookup on large control tables that aren't indexed. Each iteration causes a table scan. In the case of the SELECT TOP 1/DELETE WHERE the row with the RowNum was just used, you end up with the equivalent of a triangular join. On the counter thing, you end up with the equivalent of a square join (Cartesian Product) and then they wonder why the heck their code is so slow.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 12 posts - 286 through 296 (of 296 total)

You must be logged in to reply to this topic. Login to reply