sp_cursorprepexec slows down and down through the night

  • Hi everyone!

    I've encountered strange behavior of server cursors.

    We've got an application which reads data from a SQL Server 2012 database, mostly through server cursors.

    Each night we run heavy data processing in it, and sometimes one of these processing batches has a problem.

    The processing runs in several parallel threads, with different parameters for each one.

    The parameters tell each thread which goods it must process.

    Sometimes one or several of these threads slow down. For example, instead of running for 2 hours, a thread runs for 8-10 hours.

    Very generally, the processing works like this:

    1. The app uses server cursors (sp_cursorprepexec, etc.). We can't easily change this.

    2. The underlying cursor queries contain parameters, and each thread of the app prepares its own copy of the cursor via sp_cursorprepexec (see the sketch after this list).

    The cursors are prepared with these parameters:

    scrollopt = 0x10 (FAST_FORWARD)

    ccopt = 0x0001 (READ_ONLY)

    rowcount = -1

    3. After that, each thread uses its cursor to retrieve data over and over via sp_cursorexecute.

    The retrieved data is used to calculate inventory and sales statistics, which are written to another table.

    Depending on the settings for the current day, sp_cursorexecute can be called over 100,000 times for a specific cursor.

    4. Each thread calls BEGIN TRANSACTION at the beginning, iterates through its given goods, calculates the needed data, and calls COMMIT TRANSACTION at the end.
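
    For illustration, the driver actually issues these as RPC calls, but written out as T-SQL the pattern looks roughly like this (the statement text, dbo.Stock, and the bound values are placeholders, not our real queries):

        DECLARE @prep int, @cursor int,
                @scrollopt int = 16,   -- 0x10 = FAST_FORWARD
                @ccopt int = 1,        -- 0x0001 = READ_ONLY
                @rowcount int = -1;

        -- prepare the parameterized statement and open a cursor over it
        EXEC sp_cursorprepexec @prep OUTPUT, @cursor OUTPUT,
             N'@good_id int',
             N'SELECT GoodID, Qty FROM dbo.Stock WHERE GoodID = @good_id',
             @scrollopt OUTPUT, @ccopt OUTPUT, @rowcount OUTPUT,
             1;                                  -- value bound to @good_id

        EXEC sp_cursorfetch @cursor, 2, 1, 100;  -- fetchtype 2 = NEXT, up to 100 rows
        EXEC sp_cursorclose @cursor;

        -- every later iteration reopens a cursor from the prepared handle
        EXEC sp_cursorexecute @prep, @cursor OUTPUT,
             @scrollopt OUTPUT, @ccopt OUTPUT, @rowcount OUTPUT,
             2;                                  -- the next good
        EXEC sp_cursorfetch @cursor, 2, 1, 100;
        EXEC sp_cursorclose @cursor;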

    After running some SQL traces and investigating, I've found that there are 5 cursor queries causing the processing slowdown:

    1. At the beginning of processing, sp_cursorexecute runs fast: less than 30 milliseconds per call.

    2. The farther it goes, the slower each call becomes: nearly 300 milliseconds at the end. Ten times slower!

    3. The performance degradation is gradual and proportional for each of the 5 cursors, from the beginning to the end.

    4. In the trace there is an obvious correlation between the Duration and CPU values for sp_cursorexecute. They climb together all the way.

    But the values in the Reads column do not climb. They stay near 1,000-2,000 reads.

    5. There is no critical load on the SQL Server. CPU usage is 8-10%, there is no disk queue, and disk active time is 20-30%.

    6. Activity Monitor in SQL Server Management Studio shows no significant waits: only about 30 ms per second for Buffer I/O and about 30 ms for Logging. I checked this close to the end of the processing.

    7. Each of the 5 queries uses index hints on each of its tables, so I don't think it can be an issue of a suboptimal query plan.

    8. Data in the tables involved can change while the processing is running, but not significantly: less than 1% of all records. Each table contains 10-30 million records.

    9. At least at the end of the processing there are no concurrent processes that could cause locks and waits, yet it still slows down.

    10. Data is fairly uniform across the goods, so the queries touch roughly the same amount of data at the beginning and at the end.

    One piece of evidence for this is the stable value of the Reads column in the trace.

    Could you please advise me on this?

  • owl83 (8/21/2015)



    First let me say it is very nice to see such a detailed account of a problem! Second, let me say that it is a BIG description and likely a problem that goes beyond what a forum can address.

    Having said that, I had a number of thoughts as I proceeded through your description. All were mostly dismissed by additional statements as I read on. These included:

    1) (Re)compilation issues. You could try forced parameterization anyway to see if it helps; I have had GREAT success with that in the past on cursor-driven systems. (A one-line sketch follows this list.)

    2) CPU

    3) IO

    4) locking/blocking
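
    For reference, forced parameterization is a database-level setting; a minimal sketch, assuming your database is named Sales (the name is illustrative):

        ALTER DATABASE Sales SET PARAMETERIZATION FORCED;
        -- revert with: ALTER DATABASE Sales SET PARAMETERIZATION SIMPLE;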

    I am still betting for now on 1 or 4. What you must do at this point is a real-time analysis of what each spid is waiting on. The best tool for this is sp_whoisactive. Get it, study up on the parameters you can use, and I am confident you can pinpoint the root cause here. For example, a reasonable starting invocation (the parameter choices are just a suggestion) is:
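
        EXEC sp_WhoIsActive
            @get_plans = 1,           -- include each request's query plan
            @get_locks = 1,           -- include held/requested locks as XML
            @find_block_leaders = 1;  -- roll blocked-session counts up to head blockers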

    Page latching on the inserted-to table was not ruled out. It too would show up in sp_whoisactive.

    I note that tlog writes or even data file dirty page writes could also be an issue. Again, small increases can kill you on repeated activities. Make sure there are NO other processes that kick off (via SQL Agent or any other mechanism).

    Oh, I would also do file IO stall and wait stats differential analyses, looking for cumulative issues during, say, a 3-5 minute run at various points throughout the process. A sketch of the wait stats half follows; file IO stalls can be diffed the same way from sys.dm_io_virtual_file_stats.
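
        -- snapshot server-wide wait stats (temp table name is illustrative)
        SELECT wait_type, wait_time_ms, signal_wait_time_ms, waiting_tasks_count
        INTO #waits_t0
        FROM sys.dm_os_wait_stats;

        WAITFOR DELAY '00:05:00';   -- the 3-5 minute sample window

        -- which waits accumulated during the window?
        SELECT w.wait_type,
               w.wait_time_ms        - t0.wait_time_ms        AS wait_ms_delta,
               w.waiting_tasks_count - t0.waiting_tasks_count AS waits_delta
        FROM sys.dm_os_wait_stats AS w
        JOIN #waits_t0 AS t0 ON t0.wait_type = w.wait_type
        WHERE w.wait_time_ms > t0.wait_time_ms
        ORDER BY wait_ms_delta DESC;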

    Note that this could well be a difficult issue to nail down. You likely won't see huge numbers floating around: 30ms to 300ms still means less than a 0.3 second increase per event. It is the FREQUENCY of events that is cumulatively killing you here - which is why RBAR is so bad in data processing.

    If you can't figure it out, I will offer up my performance tuning consulting services. 😎

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Thanks for your reply, Kevin! I really appreciate that.

    TheSQLGuru (8/21/2015)

    1) (Re)compilation issues. You could try forced parameterization anyway to see if it helps. I have had GREAT success with that in the past on cursor-driven systems.

    I'll try to capture query plans next week to be sure the plan stays identical across iterations. Maybe this will help me track down something I haven't seen before. One way I plan to peek at the open cursors and their statements while the job runs:
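
        -- list open server cursors across all sessions (0 = all sessions);
        -- plans can be joined in similarly via sys.dm_exec_query_plan
        SELECT c.session_id, c.cursor_id, c.properties, c.creation_time, t.text
        FROM sys.dm_exec_cursors(0) AS c
        CROSS APPLY sys.dm_exec_sql_text(c.sql_handle) AS t;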

    TheSQLGuru (8/21/2015)

    Best thing for this is sp_whoisactive.

    Thanks, it seems to be a great thing!

    TheSQLGuru (8/21/2015)

    Oh, I would also do file IO stall and wait stats differential analyses, looking for cumulative issues during say a 3-5 minute run at various points throughout the process.

    I'll try it. It's surely a more accurate way to watch waits than just staring at Activity Monitor.

    Also, I'm planning to split that single huge transaction into many small ones. Maybe that will help.

    TheSQLGuru (8/21/2015)

    It is the FREQUENCY of events that is cumulatively killing you here - which is why RBAR is so bad in data processing.

    I completely agree.

    A) I don't think you understood what I meant with my (re)compilation issues statement: not that plans are changing, but that frequent (re)compilations are causing slowness. But I suppose plans could be changing too. There are specific waits associated with those, so they can be seen in both real-time and aggregate analyses. I note that in my experience they are quite rare in the real world, except perhaps on grossly underpowered machines.

    B) I realized that the big transaction does lend itself to locking/blocking issues.

    C) Just thought of 2 new potential issues: file autogrowth and auto statistics updates firing. Depending on your IO picture, either or both of those can cause "random" perf issues. Two quick checks, as a sketch (dbo.Stock stands in for one of your hot tables):
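
        -- 1) Autogrowth settings: percent growth or tiny fixed increments
        --    mean frequent, stalling autogrow events under load.
        SELECT name, type_desc, size * 8 / 1024 AS size_mb,
               CASE WHEN is_percent_growth = 1 THEN CONCAT(growth, '%')
                    ELSE CONCAT(growth * 8 / 1024, ' MB')
               END AS growth_setting
        FROM sys.database_files;

        -- 2) When each statistic on a hot table was last updated; timestamps
        --    that move during the nightly run point to auto-updates firing.
        SELECT s.name, STATS_DATE(s.object_id, s.stats_id) AS last_updated
        FROM sys.stats AS s
        WHERE s.object_id = OBJECT_ID(N'dbo.Stock');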

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • You talked about each cursor being prepared and executed. Have you verified that each one is being closed when completed?

    Also, I'd like to recommend that you change these calls to a stored procedure and ensure that sp_ExecuteSQL is being used so that there's at least a chance of the execution plan being reused. For illustration, a parameterized call looks like this (the query and table are placeholders, not your app's actual code):
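
        EXEC sys.sp_executesql
            N'SELECT SUM(Qty) FROM dbo.Stock WHERE GoodID = @good_id',
            N'@good_id int',   -- parameter definition enables plan reuse
            @good_id = 42;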

    Last but not least, 1000-2000 reads per hit sounds a bit high to me. What is the code actually doing?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

    I've found a workaround for this issue. I just split the one huge transaction into many small ones: one per iteration.

    I tested it on Saturday several times, both with one huge transaction and with many small transactions.

    That part of the processing became 20 times faster with many small transactions. The shape of the change is sketched below (the loop body is a placeholder for the real per-good work):
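
        DECLARE @i int = 1;
        WHILE @i <= 100000
        BEGIN
            BEGIN TRANSACTION;
            -- per-iteration work: fetch one good's rows via the cursor,
            -- calculate the statistics, insert them into the results table
            COMMIT TRANSACTION;  -- commit each iteration instead of once at the
                                 -- very end, so locks are released as we go
            SET @i += 1;
        END;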

    Jeff Moden (8/23/2015)


    You talked about each cursor being prepared and executed. Have you verified that each one is being closed when completed?

    Yes, sp_cursorclose is being called.

    Jeff Moden (8/23/2015)


    Also, I'd like to recommend that you change these calls to a stored procedure and ensure that sp_ExecuteSQL is being used so that there's at least a chance of the execution plan being reused.

    I can't do this easily. Such calls are managed by the application kernel, and that's a closed area.

    Jeff Moden (8/23/2015)


    Last but not least, 1000-2000 reads per hit sounds a bit high to me. What is the code actually doing?

    One of these cursors calculates stock quantities for the given goods and walks through the result. That really can cause this many reads.

  • Ok. Glad you sussed it. I hope it doesn't come back as a problem in the future because it sounds like (and I could be wrong) a temporary fix to a permanent problem. Thanks for the feedback on your fix.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Locking/blocking from a large transaction was the cause. Glad you had a simple fix! 20X improvement is AWESOME! :w00t:

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • TheSQLGuru (8/24/2015)


    Locking/blocking from a large transaction was the cause. Glad you had a simple fix! 20X improvement is AWESOME! :w00t:

    Yeah, old code "is dark and full of terrors", but fortunately sometimes there is a fast way to fix it.
