Strange behaviour

  • Hi all,

    Using SQL2K SP2 on Win2K SP4.

    Got a strange problem: Each night, we transfer all prod DBs to a reporting server, through a batch of scheduled jobs. The final job each night is to perform UPDATE STATISTICS.

    Each Monday morning, I get a user complaining about slow / no response from his report. The only thing I do is start the job which does the UPDATE STATISTICS. When it's done, his reports run fine.

    There is no activity on the server between the 4:30 a.m. run and the 9:30 run.

    On Tuesday this week, I had the same problem, again solved by running the job.

    This morning, however, no problems: the report came back as expected.

    I'm really at a loss as to where to start looking for the cause.

    Any ideas?

    P

  • As it stands it does sound baffling.

    I think I would be looking at reindexing rather than just updating stats, but maybe there are good reasons why you don't.  I would also check out scheduled "optimisation" jobs that might be de-optimising data (reorganising, shrinking...).  But none answers an intermittent problem solved in such a way.  I would also consider what is sitting in the memory.  We reboot most servers nightly to recover anything lost to leaks, and running a process may be loading up a cache so that performance improves afterwards.

  • When optimizing databases, you may use up space. Check space allocation and see if it will help.

  • Who said anything about optimizing? I described intermittent behaviour...

  • You probably need to execute sp_recompile for each table and sp_refreshview for each view on the database after the statistics have been updated.

    RegardsRudy KomacsarSenior Database Administrator"Ave Caesar! - Morituri te salutamus."

  • I don't see how that would explain the observed behaviour: slows right down on Monday and Tuesday, but not on Wednesday - Friday?

    It's the same processes running every night.

  • Does the Update Stats job fail ever?

    I agree that something strange is happening. How are you transferring the dbs?

    This is a separate server, right? Any restarts on it during the week? Looking for regular processes that might run on Wed-Fri and not Mon-Tue.

  • Nope, it's had a long run of successful completions.

    Using DTS Transfer DB task.

    Yes, separate server, no regular restarts.

    I will try one of the suggestions above, rebuilding the indexes, rather than updating the stats, and let you all know.

    Thanks!

  • Hmm, DBREINDEX has not changed anything.

    Interesting: the app calls a stored proc. The sproc took 40 seconds to run from the app, I cut and pasted the call from Profiler, and it ran in 4 secs -- a 90% improvement -- in QA.

    Ran the DBREINDEX again, and the results are now coming back in the app just as fast.

    I think I'll just schedule the DBREINDEX to 9 a.m., just before my user gets here...

    P

  • schleep, need more info on this.  Maybe you have solved it but here goes...

    Are the proc when called on a Monday slow only the first time they are called or through out the day? Also, is the database recieving data over the weekend? (I take it no users are retrieving reports over the weekend) If the db is having inserts and updates over the weekend, are you running the update stats every night? (If the db is not active over the weekend, why run the update stats job?)

    If the db is getting data over the weekend and you are updating stats (even if you are not updating stats), the query plans in the proc cache will be stale and possibly based on outdated stats.

    Rebuilding the indexes every night is overkill in most situations unless there are huge amounts of new data, and if that is the case, your maintenance window would possibly be too small to do the whole database over night. (If the db is for reporting, it will most likely benefit from a different indexing strategy from your production environment, and that usually means more (and wider) indexes than the production db, which will take longer to reindex)

    OK. So why the different response times on different days. Take a look at the patterns of data updates and inserts. I put forward the guess that most happen early in the week and therefore the statistics change drastically during this time.  Therefore the plans in the cache have a longer useful life when the stats settle down.

    I think you need to at least try what rudy suggested. At the very least clear the proc cache on a Monday morning.  OK, so the first person to run any given proc after this will have to wait for the proc to recompile, but this will be shorter than the time they wait for a poor query plan to execute.

     

    Failing that, get them to recreate the problem. Then say, "Works alright now!"

  • Dave,

    Thanks for the input, I'll get back to you next week on this, and have answers to your questions.

    P

  • From BOL:

    Recompiling Execution Plans

    Certain changes in a database can cause an execution plan to be either inefficient or invalid, given the new state of the database. SQL Server detects the changes that invalidate an execution plan, and marks the plan as invalid. A new plan must then be recompiled for the next connection that executes the query. The conditions that cause a plan to be invalidated include:

    Any structural changes made to a table or view referenced by the query (ALTER TABLE and ALTER VIEW).

    New distribution statistics generated either explicitly from a statement such as UPDATE STATISTICS or automatically.

    Dropping an index used by the execution plan.

    An explicit call to sp_recompile.

    Large numbers of changes to keys (generated by INSERT or DELETE statements from other users that modify a table referenced by the query).

    For tables with triggers, if the number of rows in the inserted or deleted tables grows significantly.

    -------------------

    See item 2?

    UPDATE STATISTICS invalidates all execution plans, so every report needs to rebuild it. It takes some time, long time if report is complex one.

    Consequent Report calls reuse that execution plans, so there are no problems with performance until next UPDATE STATISTICS has been performed.

    _____________
    Code for TallyGenerator

  • In response to questions above,

    Data comes in nightly, and occasionally on weekends. The distribution of incoming data over a week would be roughly 22% on Mondays through 18% Fridays, so no real spike early in the week.

    I leave the maintenance jobs running over the weekend because the decision to open weekends may be made after I've left on Friday.

    I've changed from an UPDATE_STATISTICS to DBREINDEX, which seems to have done the trick.

    I knew the execution plan would be invalidated, however that alone doesn't explain the 4 secs vs. 40 secs. Also, the sproc is called 10 times, with mostly the same parms -- so 10 different reports, all coming from the same sproc. Each call was running 40 secs.

    I admit that 10 reports from 1 sproc is not the way to go: the guy who wrote that nonsense was a just-get-it-to-work-with-bare-minimum-of-effort-and-

    hang-the-consequences-it-won't-be-my-problem kind of developer.

    He's no longer here.

    Thanks all, for your thoughts on this.

    P

  • So, I did a little experiment: I copied the calls to the stored procedure from Profiler -- 13 calls, with the parms varying slightly, but all based on the same date range.

    I ran that code in QA simultaneously with the report in the app. The app took 9-10 times longer for each call, except the very first, which took 25 times longer.

    Next, I re-ran the report from the app. No change.

    Next, I DBCC DBREINDEXed, then re-ran. Now, both run in the same time, which turns out to be about 30% faster than the original QA performance.

    I have the No Recompile trace flag set on the server -- I know, I know, but our most-used sproc, which is called about 9000 times an hour, was recompiling 73 times per execution, and that simply couldn't be allowed.

    I really want to know why the same call to a sproc can take 10 times longer depending which app issues the call.

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply