CPU from 1% to 100% for 8sec query on Prod. server only

  • The production server is:

    Windows 2003 server

    1GB RAM

    DualCore processor (2.33 Ghz)

    The Server is used for the SQL 2005 database and an FTP server. It runs nothing else.

    Nothing has been installed on the server in months. In fact automatic Windows updates have been turned off to prevent ANYTHING from being installed.

    The server CPU normally runs at 0-1%. When I run a particular query, it takes 7-8secs and pegs the CPU to 100%.

    The SQLServer process, in Task Manager, shows 100%.

    Memory used by SQL Server stays around 250MB.

    I tried the same query on my computer, by taking the server database offline, and copying the database files to my machine. It runs in under a second with the CPU going to about 40%.

    History

    Around Sept. 1, they noticed a major slowdown in the computer and narrowed it down to this query. After two weeks I was notified and added 2 indexes to get the query from 8secs to 1sec.

    2 weeks later, I was called again to say that the query was locking up the computer.

    Since the query runs fast on my machine, I have to assume it has something to do with the production server. However, no hardware has changed, and no software has been installed.

    What else can I look at to see why the query is running so slow?

  • It's not unusual to have SQL spike the CPU for short periods of time.

    Have you checked the execution plans to see if they're the same? Rebuild statistics? Is the data set the same?

  • The data has to be the same since I took the production database offline, then copied the MDF and LDF files from the server to my computer, and then put both databases online.

    My understanding is that statistics are part of the database, so the method I used would result in identical statistics also. Besides, I haven't defined any statistics for the database, but if you think performing a statistics rebuild will make a difference, I will try that.

  • Everything should be the same if you copied the files.

    It couldn't hurt to rebuild them. Have you checked the execution plan?

  • First, thank you for taking the time to help me figure this out. I really appreciate your time and effort.

    I understand where you're going. You are focusing on optimizing the query itself. That was also my first thought.

    I know the query can be written better, but I don't see how that could cause it to take 8secs on another machine and less than a second on mine.

    Keep in mind that

    A.) It was fast for two years.

    B.) It became slow (took 8 seconds to run), seemingly all of a sudden.

    C.) I added indexes, which sped it up to around a second.

    D.) Two weeks later it slowed down again (takes between 7 and 8 secs)

    E.) It runs in under a second on my machine.

    I do not mind looking into anything that might solve this, as long as it isn't a temporary solution.

    My concern is that I already optimized the statement by adding indexes. And then 2 weeks later it slows to a crawl again. Let's say I do rewrite the query and add statistics and anything else that can make it as optimal as possible. And, even if it again runs in second, I still don't know what spontaneously caused it to slow down so drastically. And, what's to prevent it from spontaneously slowing down again?

  • It should be part of your maintenance plan, but when was the last time the indexes were rebuilt/reorganized?

  • Post the execution plan for the query. Let's see where it's slowing down and that may give a hint as to why the performance varies. Are you running the same version of SQL Server on your box as on the server?


    And then again, I might be wrong ...
    David Webb

  • It's possible that the data set exceeded a size and the query will no longer function well. There are lots of queries that do work fine for years and then data reaches a certain size and it needs rewriting or some changes to work well.

    My issue with the statistics is that a sub-optimal plan might be running, so a rebuild may help. It doesn't explain the difference between the boxes, which has me confused as well. A backup/restore might have differences, but I wouldn't expect a detach/attach to have anything different.

  • Over time the characteristics of the data that makes up the indexes could change. This can result in different execution plans. Based on what you're saying, I think this is possible. The interesting thing is, when this happens, it usually happens quick, not slowly over time. Because a given index is usually selective enough until it isn't and when it isn't, it isn't all at once.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Have you checked the disk subsystem for problems? Maybe a bad sector, maybe somebody has filled up the drives, maybe the disks are extremely fragmented causing a lot of disk thrashing.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • Is the query forcing recompiles? Or forcing the use of a specific execution plan each time (maybe the wrong execution plan)?

    This could still be a hardware or network issue. How is the proc being called? Through a webservice? Through a client-side program? How many times is the query being called in any given moment?

    Profiler or a Server-Side trace can help you determine this information during the slowdown. You running it once on your machine can go really fast. Running it multiple times at the same time on the server is a different ball of wax. Account for this behavior in all troubleshooting.

    Then do the following

    1) Check your server's paging file.

    2) Check your PerfMon SQL Server counters

    3) Check your PerfMon Memory & I/O counters

    Is there anti-virus software running against your .mdf and .ldf files? This one has caught a number of people and if someone keeps "un-excluding" your SQL Server files, could be the cause of the problem.

    How full is the hard drive? What other processes (check Task Manager for all users) are actually running while this is running. Nevermind that no other programs should be running. Task Manager does not lie and might be running lots more background processes than you expect.

    What about phantom connections to SQL Server? They could be consuming a lot of memory and processing power.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • Oh, and before I forget... Have the stats on the table been updated at all since inception?

    And just how long have all your services been running?

    I don't recommend restarting them until and unless you've eliminated all other possibilities, but could be that something in the OS has chewed up the cache and is therefore causing issues with SQL because it won't or can't clear out.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • To me the interesting thing is the difference in performance on the production server and on your machine.

    We know that the production server is a Windows 2003 server with 1GB RAM and DualCore (2.33 Ghz) processor, that it runs SQLS 2005 (which edition?), and that when this query is run the SQL Server store occupancy remains stable at about 250MB. But we know nothing about your machine - so it's impossible to comment on why the performance for this query may be different.

    Also, what is the physical layout of the .mdf file on the production server - is it scattered all over the disc, or is it in one contiguous chunk? Is the .LDF file on the same disc or on a different one? And so on - and how do all these physical things compare between the production server and your machine? And what else is happening (other queries running, backups being taken, ...) in the database at the same time as this query, and is all the same stuff happening on your machine?

    I think it probably that Grant is right and the data (size or distribution of values) has hit some sort of tipping point - but where such a tipping point is can often depend on physical characteristics of the environment and/or on other simultaneous queries, and here that surely must be the case since the performance is so radically different between two machines with the same data.

    Tom

  • RichardFM (10/12/2010)


    First, thank you for taking the time to help me figure this out. I really appreciate your time and effort.

    I understand where you're going. You are focusing on optimizing the query itself. That was also my first thought.

    I know the query can be written better, but I don't see how that could cause it to take 8secs on another machine and less than a second on mine.

    Keep in mind that

    A.) It was fast for two years.

    B.) It became slow (took 8 seconds to run), seemingly all of a sudden.

    C.) I added indexes, which sped it up to around a second.

    D.) Two weeks later it slowed down again (takes between 7 and 8 secs)

    E.) It runs in under a second on my machine.

    I do not mind looking into anything that might solve this, as long as it isn't a temporary solution.

    My concern is that I already optimized the statement by adding indexes. And then 2 weeks later it slows to a crawl again. Let's say I do rewrite the query and add statistics and anything else that can make it as optimal as possible. And, even if it again runs in second, I still don't know what spontaneously caused it to slow down so drastically. And, what's to prevent it from spontaneously slowing down again?

    How is the hard drive space?

    Do you have your data on one drive and the log on another? If you have low space on your log drive or on the drive where you house your system databases it is possible that it'll slow down any time it has to use the drive for temporary storage and processing.

    Has anyone changed the limits on how much RAM you can use on the server with Resource Governor? That can force the server to use the hard drive instead of RAM and that'll be slower.

    In a similar vein, if the amount of data has grown to the point that the query can no longer be done entirely in RAM, you'll have to hit the HD and it'll slow down.

    If you have plenty of space on your machine and access to all the RAM, it'll run well on your machine, but if the HD is full on the server, it'll run slowly there.

    --------------------------------------
    When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
    --------------------------------------
    It’s unpleasantly like being drunk.
    What’s so unpleasant about being drunk?
    You ask a glass of water. -- Douglas Adams

  • Also, what kind of logging are you using?

    --------------------------------------
    When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
    --------------------------------------
    It’s unpleasantly like being drunk.
    What’s so unpleasant about being drunk?
    You ask a glass of water. -- Douglas Adams

Viewing 15 posts - 1 through 15 (of 20 total)

You must be logged in to reply to this topic. Login to reply