Click here to monitor SSC
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Replacing Cursors and While Loops


Replacing Cursors and While Loops

Author
Message
Maurice Pelchat
Maurice Pelchat
Grasshopper
Grasshopper (16 reputation)Grasshopper (16 reputation)Grasshopper (16 reputation)Grasshopper (16 reputation)Grasshopper (16 reputation)Grasshopper (16 reputation)Grasshopper (16 reputation)Grasshopper (16 reputation)

Group: General Forum Members
Points: 16 Visits: 51

The problem is not really in the cursor itself. It is in the fact that operations are not set-oriented. Doing 1 million inserts is not the same at all as doing a single insert based on a select that returns 1 million rows and insert them in a single operation.

I converted recently a transact-sql batch that was doing about 8 million inserts. The execution time was about 8 hours. By making a big query using derived tables and case statement to apply some conversion/extraction logic in the query, and union all to bring together rows obtains under disjoin logic context, this time reduced from 8 hours to 30 minutes (a 16 to 1 improvement). By using such query it was also possible to reduce logging by doing a select INTO.

Even without doing a SELECT INTO, SQL Server optimize log writes operations. It seems that there is a lot less logging required when doing such massive single insert than when doing it row by row.

Cursors are an easy way to describe a job, but the net result is that it ends in row by row reads & writes. For example a single row read use almost all the time suitable indexes, it they exists, which may seems good. But it is very not good if you finally read all the rows from the table this way. When you match a lot of rows using a join SQL Server may evaluate that using an index is not good, and the match is going to be done with a better suited joining algorithm for this type of job. There is another optimization gain possible there.

The only problem with this approach, is that you have to work harder to define the big “do it all” query.


Clinton Herring
Clinton Herring
SSC Journeyman
SSC Journeyman (90 reputation)SSC Journeyman (90 reputation)SSC Journeyman (90 reputation)SSC Journeyman (90 reputation)SSC Journeyman (90 reputation)SSC Journeyman (90 reputation)SSC Journeyman (90 reputation)SSC Journeyman (90 reputation)

Group: General Forum Members
Points: 90 Visits: 12
This doesn't pivot the results. I wanted the the data from several matching rows in the Many table to come back as a single value of data separated by commas in the results table.

Suppose I had the following Many table...

ID DATA
-- ------
56 run
56 jump
56 scale

After the pivot the results data would look like this...

ID Result
-- ----------------
56 run, jump, scale



Kenneth Lee
Kenneth Lee
SSC Journeyman
SSC Journeyman (81 reputation)SSC Journeyman (81 reputation)SSC Journeyman (81 reputation)SSC Journeyman (81 reputation)SSC Journeyman (81 reputation)SSC Journeyman (81 reputation)SSC Journeyman (81 reputation)SSC Journeyman (81 reputation)

Group: General Forum Members
Points: 81 Visits: 1

After the pivot the results data would look like this...

ID Result
-- ----------------
56 run, jump, scale

No, not really, it should look like:

ID Result
-- ----------------
56 run,jump,scale,

That's 15 characters long, and if the return statement was:

return substring(@value,1,LEN(@value)-1)

you could drop the comma at the end. That's assuming your string including the last ',' character was 20 or under characters long.


Kenneth Lee
Kenneth Lee
SSC Journeyman
SSC Journeyman (81 reputation)SSC Journeyman (81 reputation)SSC Journeyman (81 reputation)SSC Journeyman (81 reputation)SSC Journeyman (81 reputation)SSC Journeyman (81 reputation)SSC Journeyman (81 reputation)SSC Journeyman (81 reputation)

Group: General Forum Members
Points: 81 Visits: 1

I've had similar experiences with improving cursors. I had to look at someone's report because they weren't getting good data from it anymore. I ran the existing proc on my machine just to see what it would produce. 5 minutes later I hadn't gotten a result, and I had started to read what they were doing and understood why they weren't getting fast results, but still didn't understand what they were trying to do. 10 minutes after that I understood what they were trying to do, their proc is still running. I open up a new window and start getting answers piecemeal.

If I execute this select I'll get all the users they are interested in. Yep. If I join this information I'll find out the offices they are involved in. Yep. Hmmm, how am I going to include all the related offices. Oh, this should do it, but this could loop indefinitely, before I start this, put a different stop in the loop. (Came in handy when I blew the data comparison later and caused the infinite loop.)

Anyway, by the end of the day, I had reproduced the report information (including the information the original proc was dropping.) that they wanted. I left my workstation up overnight because I still hadn't gotten an answer from my machine in 6 hours of running.

Next morning I killed the 20 hour process, wrote up a new proc and got a report in 30 seconds. I heard the old processes ran in 4 hours on the faster production machine for 1 DB and 2 hours for another DB. (Custom processes for each DB.) My proc gets reports from both DBs in 28 seconds.


Clinton Herring
Clinton Herring
SSC Journeyman
SSC Journeyman (90 reputation)SSC Journeyman (90 reputation)SSC Journeyman (90 reputation)SSC Journeyman (90 reputation)SSC Journeyman (90 reputation)SSC Journeyman (90 reputation)SSC Journeyman (90 reputation)SSC Journeyman (90 reputation)

Group: General Forum Members
Points: 90 Visits: 12
You are correct for the code I listed in the article. I did change it in the final results here at work to remove the last comma.



Clinton Herring
Clinton Herring
SSC Journeyman
SSC Journeyman (90 reputation)SSC Journeyman (90 reputation)SSC Journeyman (90 reputation)SSC Journeyman (90 reputation)SSC Journeyman (90 reputation)SSC Journeyman (90 reputation)SSC Journeyman (90 reputation)SSC Journeyman (90 reputation)

Group: General Forum Members
Points: 90 Visits: 12
All,

I just yesterday used the techique at a contract client with similar results although not quite as dramatic as the one I wrote about. To be fair though, there is a limit to how well this may work. After I wrote the article I came across another update that was being handled with a cursor. Within the cursor 18 columns had to be updated while pivoting data. I found I only needed 7 functions to perform the 18 updates but by the time all thoses calls were made for 2.2 milllion records, the new process clock out at the same speed as the old one. I tested individual calls and they were very short but that many had an accumulative affect. The good side though is that the code is so much simpler and easier to read that I plan to replace the old code anyway.

Have fun building your functions.



Jeswanth Jaishanker
Jeswanth Jaishanker
SSC Veteran
SSC Veteran (248 reputation)SSC Veteran (248 reputation)SSC Veteran (248 reputation)SSC Veteran (248 reputation)SSC Veteran (248 reputation)SSC Veteran (248 reputation)SSC Veteran (248 reputation)SSC Veteran (248 reputation)

Group: General Forum Members
Points: 248 Visits: 172

Happend to see this while browsing . thought it might be relevant here. have a look

http://www.sqlteam.com/item.asp?ItemID=2368

Jeswanth



--------------------------------
Phil.Nicholas
Phil.Nicholas
SSC-Enthusiastic
SSC-Enthusiastic (123 reputation)SSC-Enthusiastic (123 reputation)SSC-Enthusiastic (123 reputation)SSC-Enthusiastic (123 reputation)SSC-Enthusiastic (123 reputation)SSC-Enthusiastic (123 reputation)SSC-Enthusiastic (123 reputation)SSC-Enthusiastic (123 reputation)

Group: General Forum Members
Points: 123 Visits: 10
Just a couple of things I've found useful. If you have to loop generally it's quicker to use a while loop selecting each record individually rather than a cursor. Also if you find you have very large SQL statement it often runs much quicker broken up into smaller ones. If each calculation can be broken up into individual set based queries I'm sure you'll improve the performance.


Phil Nicholas
Mark Bolen
Mark Bolen
Forum Newbie
Forum Newbie (2 reputation)Forum Newbie (2 reputation)Forum Newbie (2 reputation)Forum Newbie (2 reputation)Forum Newbie (2 reputation)Forum Newbie (2 reputation)Forum Newbie (2 reputation)Forum Newbie (2 reputation)

Group: General Forum Members
Points: 2 Visits: 5

I hope there are not many people hitting this table. This may be OK as long as it is a nightly run that you know you are the only code running against it.

However long it runs, it will be locked during this process. Although fast, for databases with a lot of people expecting to update it, it would be unacceptable unless you can get it to run in less than 4 seconds (you last said 26 seconds I believe) and lock timeout's are set well above that. Speed is not always the only consideration unless you are running exclusively! In which case I would put a table lock hint on it to speed it up even more. I would recommend taking a little longer time to make sure you don't have a table lock for a half minute.

Mark


Mark Bolen
Mark Bolen
Forum Newbie
Forum Newbie (2 reputation)Forum Newbie (2 reputation)Forum Newbie (2 reputation)Forum Newbie (2 reputation)Forum Newbie (2 reputation)Forum Newbie (2 reputation)Forum Newbie (2 reputation)Forum Newbie (2 reputation)

Group: General Forum Members
Points: 2 Visits: 5

Here is how a highly available database would look. I work on databases with hundreds of thousands of users on databases unconceivable large to most. At my job location, we don't even have a nightly processing time (global app).

Whenever I read posts on this and most other sites, I have to keep in mind that they have a low concurrency of users. As a user base grows, most will have a an increasing number of problems they can't identify over time.

The following code is focused on removing cursors, actually a lot of code should be added for error checking, but for simplicity and keeping to the topic it is left out. While loops in our environment are a necessary evil to keep the number of rows being updated down to a small number so they can complete and release locks within 4 seconds (our apps rule; should actually be lower; perhaps 3). Think availablity first, then speed. At my work location, we even have variations of this that loop through rows in blocks of 5, 10, or whatever higher number still allows us to complete in less than 4 seconds. As you can see, it is a completely different mind set.

--------------------

Declare @tmp_id int
Declare @prev_id int
Declare @tmp_values varchar(20)

Set LOCK_TIMEOUT = 12000 -- assuming all code in app is tested to have
-- no update take longer than 4 seconds

Set @prev_id = -1 -- assumes all ID's are positive in nature

While Exists (
SELECT TOP 1 @tmp_id = id from DB.dbo.OutPut_tbl WITH(NOLOCK)
Where id > @prev_id -- retrieve next row > than the previous
order by id -- key data ; preferable clustered ;
-- order by id to maintain
-- a reference point
)
BEGIN
--Direct column pivot into a variable
-- I don't know enough about the column_out field; but lets hope we never
-- increase in size greater than 20; declarations and table field would
-- need to be changed if that
-- was possible.

SELECT @tmp_values = @tmp_values + convert(varchar(20), column_data) + ','
FROM DB.dbo.many_tbl WITH(NOLOCK)
WHERE id = @tmp_id

UPDATE DB.dbo.OutPut_tbl WITH(ROWLOCK) -- assuming that this will take
-- less than 4 seconds
-- otherwise create another
-- while exists loop
SET column_out = Left(@tmp_values,20)
WHERE id = @tmp_id

SET @prev_id = @tmp_id -- move our pointer to the next row

END -- While Exists


Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search