Multi Processor server maxing out on simple jobs.

  • Hi,

    Before I start, a little information and history. We had business objects on a very fast box (24 processors and 72GB memory) for test, and a shared test SQL server which limped along on 2 processors and about 6GB of memory. As the BOBJ box did very little being a 32 bit piece of software, a request was put to IT to allow SQL to be added to the same box, which was done. The main reason was the time it took to test jobs out on the Test SQL server which was up to 10 times the time for our live server (I know live and test should ideally be the same, but its not what we had).

    Our live SQL server is only a little better than our old test box and is shared with some processor intensive programs like message vault etc, so our belief was that moving the SQL server to the fast test BOBJ machine would enable testing to be done much faster than live.

    What we saw was that all the processors maxed out and the whole system would grind to a halt. We were able to do this with four lines of SQL code linking to two tables (which rules out BOBJ), but also the automated jobs that ran on the live server (remember not very fast) would max out and virtually stop on the new Test server.

    I don't have full access to the machine as my position is one of business intelligence and using SQL code rather than the back end, but our IT department has drawn a blank to the reason that this has happened. They have put in some changes which has basically limited it to 8 processors, which then max out and the job still fails.

    We run SQL 2005 on windows 5.2 R2. The SQL was installed from new, but the disks were disconnected and reconnected to the new server with a change to Directory from E: to F: (san disks)

    Processors are 6 quad core and 72GB memory.

    Please feel free to throw anything into the hat, as I have no idea what is causing this and I am prepared to check as much as I can gain access to or pass other bits to our IT department if I don't have the access.

  • How much max memory allocated to SQL Server instance?

  • NOT using AWE to allocate memory

    Minumum server memory (in MB): 0

    maximum server memory (in MB): 60000

    Index creation memory (in kb, 0 = dynamic memory): 0

    Minumum memort per query (in KB): 1024

  • Strange!!!

    How long it takes to run same on Production? OR is there significant change in test and prod instance data? May be someone has populated/doubled the data on test for testing which you are not thinking about?

    Have you tried profiler to identify, if any other thing is not pinching the resources?

    Cheers.

  • We have had a backup of live restored to test for the data so that the data is exactly the same and run the same jobs.

    A job that will take 30 mins on the smaller live server is taking around 2 hours to complete on the test (if it does complete) and a 2 hour live job is basically hanging after about 1 hour.

    I have heard that another part of the business is having similar issues with their SQL server and a goldmine front end, though that one is SQL 2008 so I am now starting to think that it is some policy or piece of software that is on each machine, which could be something like the antivirus system and setup, or something else.

    I have gone past the stage of loosing sleep on this. I had a great nights kip last night ready to try and hit the issue today <he says, though really not expecting any change> 😉

  • SOOOO many things could be at play here - you likely need to get a professional involved for a few hours to identify what the root cause is. Here is a quick list that is not even close to being exhaustive:

    1) IO stack misconfig (or just not sufficient)

    2) Memory contention between BOBJ and SQL Server leading to OS paging

    3) Restrict SQL Server MAXDOP or possibly even use CPU affinity (VERY advanced topic there)

    4) do a file IO stall and wait stats analysis to see what is going on from those two standpoints

    5) tempdb issue

    6) blocking - use sp_whoisactive during execution to see lots of stuff

    7) bad plan due to out of date statistics?

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply