I guess I will concede to your view of cursors because I don't have time to take your challenge. I have read many articles on cursor vs set vs while loop. In each article, the cursor loses. I would say I can't conclude that in every case a while loop is better than a cursor, but I would say that there is always a way to turn something into a set based solution, even if you have to use # tables to do it. I would imagine this depends on how many rows you need to process. There's a great set of articles on cursor vs set (and ... maybe while loop, can't remember) done by Itzik Ben-Gan from SQL Server Mag a few months back. He basically finds a way to turn the schema into something that works with set.
If you can turn a row-based solution in a set-based solution, you (nearly) always gain performance. That is absolutely true. And Itzik is a true genius in finding creative ways to combine SQL Server's features to let it do amazing things. But I am sure that in some cases, an efficient set-based solution is simply impossible, even for the likes of Itzik.
I was trying to find articles on how a cursor actually works as far as memory allocation and other junk a cursor does, and I couldn't find any with the 10 or so minutes I have to reply. I did find an article that states a cursor locks the entire set of data it's set up to work with. A while loop does not do this. There's one benefit a while loop has. The article, although possibly not an authority on SQL - I have no idea, is here: http://www.sqlbook.com/SQL/Avoiding-using-SQL-Cursors-20.aspx
I have just finished reading the article you mention. I have no idea who wrote it either, but I can tell you that it is seriously flawed.
1) Cursors blocking tables? My foot!
Since two of the three major cursor types are designed to be able to respond to changes made to the table while processing the cursor, and the third one works under the cover very similar to the temp table/WHILE approach, I can assure you that a cursor should not lock the entire table, and definitely not during the entire processing. If you, or everyone else, ever saw that happening, the cause must have been either bad code, or a bug in the product. It just happens that I was involved in another discussion earlier this month, where someone else ("ramireddy") and I both posted some sample code to show the effect of cursor options on concurrent modifications. None of these involve blocking. The link to this discussion is below.
2) Efficient code? Nope.
The sample code presented is awful. The cursor code uses default options - I have already told you that those are bad news. Yes, you can beat performance of THAT cursor with a WHILE. Just like you can also go faster in a 2CV than in a Lamborghini - if you step out of the Lamborghini and push, but use the engine of the 2CV.
And the WHILE code is equally awful. No PRIMARY KEY on the table. This will result in a full table scan for each iteration of the loop. Try this with a ten-million row result set, and weep as your server grinds to a halt.
3) User-defined functions? Please not.
The "alternative" to use user-defined functions is almost worse. The optimizer has no choice but to call the UDF once for each qualifying row. And then access the table used in the UDF for that row. Performance will seriously suffer. For the case presented, the better alternative would have been to join the Customer table with a derived table or CTE that aggregates the Sales, and then use a CASE expression to find the discount.
I tried to check the forum discussion, but found the link to be broken. But please, don't take any of the presented advise serious. Not everything in that article is incorrect, but enough is to warrant discarding it as a whole.
Hugo if you could supply some links on how cursors work behind the scenes I would much appreciate it. It should be easy to judge if the cursor is more heavy handed than the while loop which is better to use.
In my previous post, I already gave the links to my two blog posts on cursors - again, they should be corrected, as the results presented are only accurate for queries that process less data than fits in cache.
The discussion I mentioned before, on Microsoft's MSDN forums, includes some background on the internals of query processing, and links to other articles with more background. (I don't agree for 100% with those articles, which is why I contributed to the discussion as well).
You'll find it at http://social.msdn.microsoft.com/Forums/en-US/transactsql/thread/13a38f34-ec01-48c5-928a-24d95bdd1fb4/
Note that some of the posts made at the start of the discussion are not entirely accurate, so please read the entire discussion.
That link takes me to an article about compression. But I guess that it would be off-topic for this discussion anyway. :-)
Good luck with all your other tasks. I can't really find fault with your priorities - but do save the links I provided, in case you ever find yourself in a sitution where you do need to implement row-based processing.
Hugo Kornelis, SQL Server MVP
Visit my SQL Server blog: http://sqlblog.com/blogs/hugo_kornelis