There Must Be 15 Ways To Lose Your Cursors… Part 2: Just Put It in a S

  • DC (4/27/2009)


    I will say, one Cursor I haven't quite figured out how to eliminate, is when a vendor requires that something be done via a Stored Proc, and you want to do that something for each row in your result set.

    [font="Verdana"]See my earlier comment about this being a design issue: the vendor has designed their application to force a row-by-row approach because they don't know any better. So as you say, you don't really have a choice other than to use row-by-row.

    As an aside, I would code that in this way:

    while (1=1) begin

    select top 1

    @Id = Id,

    @Rcpt = Rcpt,

    @Msg = Msg

    from dbo.NewMessages

    where SentDT is null;

    if (@@rowcount = 0) break;

    exec dbo.SuperSecretSendNewMessageSP

    @Id,

    @Rcpt,

    @Msg;

    update dbo.NewMessages

    set SentDT = getdate()

    where ID = @Id;

    end; -- while

    Look Ma, no cursor! (Yes Jeff, I know it's just as bad as a cursor... it's still row-by-row.)

    I find the syntax of cursors annoying and most of the time you don't need one even to do row-by-row.

    [/font]

  • This is similar to how we handle it. You will get incredible performance increases by not using the cursor. However I do it a little differently, we use a counter based on whatever the key is for the table. That seems to generally add to 'debugability' and I don't think it slows down the loop (I haven't tested that though). I generally only work with tens of thousands of rows and it might make a noticable differenct on millions of rows to go this way.

    John

  • the vendor has designed their application to force a row-by-row approach because they don't know any better. So as you say, you don't really have a choice other than to use row-by-row. As an aside, I would code that in this way:

    I don't disagree about anything here. I just want to propose a possible alternative work around. What if he created a table valued function that ran the stored proc and returned the ID when relevant. It seems like he could then use a CROSS APPLY in an UPDATE statement and avoid even a While loop.

    But I don't know that this is any better than doing a WHILE loop. Other than a using CROSS APPLY in a very few DMV queries, I have no experience with it. I can't say if it is any better to run your own loop or let SQL Server do it with the cross apply. And I can't say if this idea is even possible/workable. It just seems like it should be. Something to think about.

    Maybe it will be addressed in a future article.

  • DC (4/27/2009)


    I will say, one Cursor I haven't quite figured out how to eliminate, is when a vendor requires that something be done via a Stored Proc, and you want to do that something for each row in your result set.

    It depends on what the stored procedure is doing. In instances like this I have read the vendor code and if possible, write my own set-based process that mimics the code.

  • JJ B (4/27/2009)


    the vendor has designed their application to force a row-by-row approach because they don't know any better. So as you say, you don't really have a choice other than to use row-by-row. As an aside, I would code that in this way:

    I don't disagree about anything here. I just want to propose a possible alternative work around. What if he created a table valued function that ran the stored proc and returned the ID when relevant. It seems like he could then use a CROSS APPLY in an UPDATE statement and avoid even a While loop.

    But I don't know that this is any better than doing a WHILE loop. Other than a using CROSS APPLY in a very few DMV queries, I have no experience with it. I can't say if it is any better to run your own loop or let SQL Server do it with the cross apply. And I can't say if this idea is even possible/workable. It just seems like it should be. Something to think about.

    Maybe it will be addressed in a future article.

    Can't execute a stored procedure inside a table-valued function. You can execute an extended stored procedure.

  • Jack Corbett (4/27/2009)


    It depends on what the stored procedure is doing. In instances like this I have read the vendor code and if possible, write my own set-based process that mimics the code.

    [font="Verdana"]Exactly. In which case you are no longer using the vendor API, and if they change the functionality of their API, you will also need to change your code to match. It creates an intimate coupling between the functionality of their code and your code, and you lose the advantage of having an API in the first place.

    Which points the finger squarely back at the poor design of the vendor API. You shouldn't need to circumvent an API to use it.

    [/font]

  • Bruce W Cassidy (4/27/2009)


    Jack Corbett (4/27/2009)


    It depends on what the stored procedure is doing. In instances like this I have read the vendor code and if possible, write my own set-based process that mimics the code.

    [font="Verdana"]Exactly. In which case you are no longer using the vendor API, and if they change the functionality of their API, you will also need to change your code to match. It creates an intimate coupling between the functionality of their code and your code, and you lose the advantage of having an API in the first place.

    Which points the finger squarely back at the poor design of the vendor API. You shouldn't need to circumvent an API to use it.

    [/font]

    Agreed. Usually in these cases it is a one time or rarely used solution that is so much faster it was worth the time, usually fairly short time any way.

  • Bruce W Cassidy (4/27/2009)


    Look Ma, no cursor! (Yes Jeff, I know it's just as bad as a cursor... it's still row-by-row.)

    I find the syntax of cursors annoying and most of the time you don't need one even to do row-by-row.

    [/font]

    Heh... you know me too well, Bruce.

    For something like running a RBAR proc, there are several solutions. Here's a list in the priority order I'd probably assign (it depends)...

    1. If it's an in-house proc, do an ROI and impact analysis and based on the analysis, fix it or not. If it's using a "Select" cursor, make sure that it's read only, forward only. Static isn't an automatic thing as Static makes a temp table in the background and the process of making that temp table may be more costly that replacing the cursor. And, replacing a cursor with a Temp Table/ While Loop combination has no performance or resource advantage over a good firehose cursor unless you just don't like Cursors and have a real passion to replace them. If you don't believe you can remove the RBAR backbone, then set it to do the least damage and walk away.

    2. If it's a third party proc, you might be stuck. If you can't get them to fix it and they don't respond to a healthy dose of pork chops, you might not be able to do anything about it without violating a product warrantee. Of course, if it's really a pig, then who cares about product warrantee? Do an ROI and impact analysis and follow the steps in 1 above. Make sure you get management approval to violate any product warrantee or learn how to flip pork chops and burgers for a living.

    3. If it's a new proc, there's virtually no excuse. Sure, there are some things that you might not be able to think of a set based solution for, but 99.9999% of the time, there is and it's usually worth pursuing. Personally, an explicit loop to send emails is probably one of those things that actually works just as well in a loop as the set based solution because the loop isn't the performance problem there. There is, however, a set based solution that I believe Barry is also aware of and I look forward to seeing if it comes out in the article.

    Someone said it in one of the posts... if it's just gotta be RBAR, why not write an app for it? Just keep in mind what I said... usually (ie. 99.9999% of the time), it doesn't have to be RBAR. Rewrite the proc you're trying to loop on to operate in a smart set based manner.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • JJ B (4/27/2009)


    the vendor has designed their application to force a row-by-row approach because they don't know any better. So as you say, you don't really have a choice other than to use row-by-row. As an aside, I would code that in this way:

    I don't disagree about anything here. I just want to propose a possible alternative work around. What if he created a table valued function that ran the stored proc and returned the ID when relevant. It seems like he could then use a CROSS APPLY in an UPDATE statement and avoid even a While loop.

    But I don't know that this is any better than doing a WHILE loop. Other than a using CROSS APPLY in a very few DMV queries, I have no experience with it. I can't say if it is any better to run your own loop or let SQL Server do it with the cross apply. And I can't say if this idea is even possible/workable. It just seems like it should be. Something to think about.

    Maybe it will be addressed in a future article.

    CROSS APPLY isn't much more than a fancy correlated sub-query with all of the same advantages and disadvantages... again, it depends. If the correlated sub-query or Cross Apply spawns the same rows more than once, then it's RBAR and it will have performance problems typical of many "Triangular Joins". In some cases, both methods can actually provide some performance advantages. In other cases, either can provide "Death by SQL". 🙂

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • And, sorry... I read past it in the previous post. Lynn is correct... you can only execute extended stored procedures in a function. But what good would it be, anyway? UDF's are basically RBAR and. like anything else, there are some good ones and there are some nightmares.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • greatheep (4/27/2009)


    So, the gauntlet is down - any suggestions on how to achieve the same benefits using set processing?

    In summary, we want:

    - If there is an error in a row, none of the various updates/inserts that are done should be kept - all should be rolled back.

    - Any rows which have no error should process completely, committing changes to the database.

    - We should be able to tell after the fact which rows processed completely, and which rows have an error, as well as the specific error each row experienced (they could be different for each row in error)

    I agree with other replies, without specifics it's hard to provide an exact answer, but I think a general approach can be suggested.

    It sounds like your staging table already has an error column. For the following general approach, you may want to either add a column for each possible error condition or create an error table with a many to one for each staging row. Which one you choose depends on the number of possible errors, what your reporting requirements are, etc.

    The overview for a set based solution is quite simple (especially because I don't know your specific requirements ;-)) and you probably already have many error checking code snippets in your cursor loop that you can re-use. Basically you run a series of set based updates on the entire staging table (or related error table) to flag the rows that match each error condition. Then you process all rows that don't have error flags set and report on the ones that do.

    Obviously this will not help for cases where you have concurrency issues because you can't screen for that. Once approach for concurrency problems would be to break the set of good rows into "chunks" and lump all inserts and the update which marks the rows as "complete" into the same transaction. If anything "errors out", the whole transaction will be rolled back. I'm sure you'll get lots of differing opinions on that approach, though. :w00t:

    Hopefully set based updates won't present as big a target for blocking locks, but depending on your server and other concurrent processes, 1.5 million rows at a time might be a big bite to chew.

  • Andrew L. Smith (4/27/2009)


    Barry:

    I agree with this for the most part, but have an exception that I would like to see if you have an answer for. For example, I need to receive a number of records from a Message Queue and perform (or not perform) a stored procedure, passing information from the record read to the stored procedure. I am not updating anything and I do not need the combined results of the cursor. Can you suggest a way to accomplish this without a cursor? (I am actually passing the data to a stored procedure that e-mails me an alert that certain events have occured on the database server.)

    Regards,

    Andrew Smith, SQL Server DBA

    You've got my curiosity up, Andrew. What does the stored procedure do that it needs to work on only a single row at a time? Is it a 3rd party proc?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I've recently used CURSOR in one of my application (unwishingly though). We were working on a survey application and we collected the data of our web forms in our custom entity objects and finally xml serialized them on a button's click and passed the entire xml to a stored procedure which traversed and read the xml and inserted the data into the actual database. Now the structure of the xml was something like this

    <Questions>

    <Question type="1">

    <text>how are you</text>

    <cretionDate>2008-12-3</cretionDate>

    <QuestionDetails>

    some more nodes........

    </QuestionDetails>

    </Question>

    <Question type="2">

    <textvFeeling Good</text>

    <cretionDate>2008-12-3</cretionDate>

    <QuestionDetails>

    some more nodes........

    </QuestionDetails>

    </Question>

    </Questions>

    How could i prevent the use of cursor???

  • This is a really well written article RBarryYoung, my compliments.

    Even Jeff learned something here 🙂

    The strong case for this article is that you demonstrated with proper examples that gaps in SQL knowledge causes developers to stick with familiar constructs. For those with procedural backgrounds it is natural to come up with a solution that involves cursors.

    Similarly I seen procedural constructs in the past that do simulated joins between two tables. Where first one set is read and then for every row in that set a new statement is executed to get rows from the table to be joined. I personally named this the PHP paradigm as that is where I found this sort of code. The cause is likely due to the popularity of the language with untrained and beginning programmers. The language domain is the web and learning by example results in more HTML knowledge then architecture and SQL. Subsequently this group has only sparse knowledge of databases and SQL, if any.

    The similarities are striking!

  • Hello,

    I recently had to use curosr to do a function in my work. Although I knew I should try to avoid cursor, but could not find a way. I wonder if you can figure out doing it without cursor.

    I have a table like this:

    Name RowNumber

    RL 1

    RL 2

    ..

    RL 10

    SL 11

    SL

    .. 15

    RL 16

    ..

    RL 30

    SL 31

    .. 41

    The RowNumber is granteed to be in sequence and there are actually more than two values for the 'Name' column, only two listed here for simplicity.

    The result of the function I need is a return table varaible like

    Name startRowNumber endRowNumber

    RL 1 10

    SL 11 15

    RL 16 30

    SL 31 41

    I am really curious if there is a way to achieve it without using cursor.

    Thanks.

Viewing 15 posts - 61 through 75 (of 316 total)

You must be logged in to reply to this topic. Login to reply