Server Crash

  • I manage a database with 30 users, and my server has been crashing during index rebuilds. I have two databases, one at 2 GB and the other at 10 GB. Both have over 500 tables.

    The server kept crashing during the rebuild index maintenance plan that I set up for every Saturday night. At first it crashed about once a month, then it started happening every week.

    I upgraded the firmware on the server and the crashing stopped for three months. It must have been something wrong with the server.

    Last Saturday the system crashed again, just like before. The only thing that I changed was to add a feature called Remote Report Processing. This module allows the application to process reports on any designated server or workstation rather than on the user's workstation. It requires the software to be open and the reportsvr to be running. I set this up on the SQL server.

    I left it running over the weekend and the server crashed during the index rebuild. I went in on Saturday, turned off the feature, re-ran the maintenance plan, and had no problem.

    Could this have caused my problem?

    I watched perfmon during the maintenance plan and saw the HDD peg at over 1000. Is this normal?

    Any help would be nice.

    Thanks

    Jeff

  • Need some more information to really make a connection.

    When you say the DB crashed, can we get a little more detail on what you mean? (log filled up the disk drive, server froze, etc).

    What do the SQL Server error logs say was going on when the DB crashed? What does the Windows Event Log say?

    Are you using Enterprise Edition? If so, are you doing the rebuilds Online?

    Are there any other jobs that run at approximately the same time that could be interfering?

    What do you mean the HDD was at 1000? What is it normally at?

    Fraggle.

  • When you say the DB crashed, can we get a little more detail on what you mean? (log filled up the disk drive, server froze, etc).

    I arrive at the office and the server is completely unresponsive. The lights are on but nobody is home.

    What do the SQL Server error logs say was going on when the DB crashed? What does the Windows Event Log say?

    The event logs say it was running the maintenance plan, and that is it. All event logging stops until the server restarts. For example, the maintenance plan runs at 2 AM, I get to the office at 9 AM Saturday morning and restart the server, and the event logs show nothing between 2 AM and 9 AM. The first entry after the restart says the server was shut down unexpectedly at 2 AM.

    Are you using Enterprise Edition? If so, are you doing the rebuilds Online?

    Standard Edition, and offline rebuilds.

    Are there any other jobs that run at approximately the same time that could be interfering?

    No, I made sure of that.

    What do you mean the HDD was at 1000? What is it normally at?

    I am talking about the Avg. Disk Queue Length. It is normally less than 10, even with all users hammering in data. During the index rebuild it goes all the way up to 1120.

    As I stated earlier, I was having lots of problems with the server crashing. I ran the HP firmware update CD and they all went away. The maintenance plans were running perfectly every week for 3 months. Then I turned on Remote Report Processing on the server and out went the lights. I understand that index rebuilds are resource intensive, and it seems that if anything else tries to access the drives during the rebuild the server crashes. Just wondering if this is normal behavior.

    I am going to leave it off this weekend and see what happens. I am worried that I don’t have hardware or software configured correctly.

    Jeff
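
A rough way to put numbers on the "normally under 10, up to 1120 during the rebuild" comparison above is to summarize logged Avg. Disk Queue Length samples against a baseline. This is only a sketch: the function name `summarize_queue`, the baseline of 10 (taken from the figure quoted in the post), and the sample readings are illustrative assumptions, not values captured from the server in this thread.

```python
from statistics import mean


def summarize_queue(samples, baseline=10.0):
    """Summarize Avg. Disk Queue Length samples against a baseline.

    `baseline` is an assumed "normal" ceiling (the post quotes < 10).
    """
    if not samples:
        raise ValueError("no samples to summarize")
    peak = max(samples)
    return {
        "peak": peak,
        "mean": mean(samples),
        # How many samples exceeded the assumed normal ceiling.
        "above_baseline": sum(1 for s in samples if s > baseline),
        # How many times larger the peak is than the baseline.
        "peak_ratio": peak / baseline,
    }


if __name__ == "__main__":
    # Made-up readings for illustration only.
    readings = [4.0, 7.5, 9.0, 350.0, 1120.0]
    print(summarize_queue(readings))
```

A very large ratio on its own shows that the disks are saturated during the rebuild; it does not explain a hard lockup.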

  • Over the weekend I did not run the Remote Report Server and everything was back to normal. I am not sure why this would cause a problem, but I am glad it is OK now.

    I sent an email to the software company to get some ideas.

    Thanks for your help.

    Jeff

  • Jayoub1,

    Sorry for not getting back with you sooner. Here are the two things I would recommend.

    1) Set up a perfmon counter log that watches avg. disk queue, reads/sec, writes/sec, and transfers/sec. Then start with a medium-size table and rebuild the index manually. See what happens. If you don't notice anything, keep moving up to larger tables. Let me know what happens.

    Also, can you tell me how you have your data/log files organized (i.e., are they on 1 physical drive vs. 2 logical partitions on the same drive vs. 2 drives)? Also, what recovery model are you using?

    2) If you run the above and start to get some pretty bad disk issues, try turning off the thing that you turned on after the firmware upgrade and then try the above again. If it runs fine, then at least we have narrowed down where to start looking.

    Thanks,

    Fraggle
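
The two recommendations above could be scripted roughly as follows. This is a sketch under stated assumptions, not a tested procedure for this server: it builds a `typeperf` command line for the four disk counters mentioned (using the `_Total` PhysicalDisk instance, which is an assumption) and generates rebuild statements starting from the median-size table and moving up. The table names and sizes are hypothetical, and `ALTER INDEX ... REBUILD` assumes SQL Server 2005 or later, which the thread does not confirm.

```python
# Disk counters named in the post; the _Total instance is an assumption.
COUNTERS = [
    r"\PhysicalDisk(_Total)\Avg. Disk Queue Length",
    r"\PhysicalDisk(_Total)\Disk Reads/sec",
    r"\PhysicalDisk(_Total)\Disk Writes/sec",
    r"\PhysicalDisk(_Total)\Disk Transfers/sec",
]


def typeperf_command(output_csv, interval_s=5, samples=720):
    """Build a typeperf argument list that logs the counters to a CSV file."""
    return [
        "typeperf", *COUNTERS,
        "-si", str(interval_s),   # seconds between samples
        "-sc", str(samples),      # number of samples to collect
        "-f", "CSV",
        "-o", output_csv,
    ]


def rebuild_statements(table_sizes_mb):
    """Return rebuild statements from the median-size table up to the largest.

    `table_sizes_mb` maps a table name to its size in MB.
    """
    ordered = sorted(table_sizes_mb.items(), key=lambda kv: kv[1])
    from_medium = ordered[len(ordered) // 2:]
    return [f"ALTER INDEX ALL ON {name} REBUILD;" for name, _ in from_medium]


if __name__ == "__main__":
    # Hypothetical tables and sizes, for illustration only.
    sizes = {"dbo.Small": 5, "dbo.Medium": 200, "dbo.Large": 4000}
    print(" ".join(typeperf_command("rebuild_io.csv")))
    for stmt in rebuild_statements(sizes):
        print(stmt)
```

Running the statements one at a time while the counter log is collecting makes it easier to see which table size first pushes the disk queue up.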

  • This issue turned out to be an HP problem with the new quad-core CPUs communicating with the hard drive controller. I ended up having the problem again, and when I called HP they found an article saying this was happening quite often. I had to update drivers and firmware, and there was also a Microsoft hotfix that helped.

    Let me know if you need more information

    Jeff

  • Could you please provide the article from HP that pinpoints the issue? We are having the same issue.

  • I can send you the article, but it is in my office at work. I will send it to you on Monday if that is OK. I may have to make the documents into PDFs and attach them, but hopefully I can just copy and paste.

    Jeff

  • Thank you very much. I appreciate the quick response and will be waiting for the documents :-)

  • Hi,

    Could you please post the article links?

    Sincerely

  • Glad you were able to resolve your problem! Seems there really is a reason it is 'required' to make sure your firmware and drivers are topped off! I wonder if these patches were available when the server was first installed at your facility . . . that is definitely the time to make sure you are up-to-date.

    I also bet a quick review (using dumpchk.exe) of a mini dump file thrown during the crash would have pointed you in the right direction much more quickly. I have used that tool to get quickly pointed in the right direction in crash scenarios...

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • This is one article:

    http://forums11.itrc.hp.com/service/forums/questionanswer.do?threadId=1184406

    This is a better article:

    http://forums11.itrc.hp.com/service/forums/questionanswer.do?threadId=1185055

    This is the Microsoft hotfix that I used:

    KB932755

    On May 2nd, 2009 I did the following:

    •Ran insight diagnostic for 3 loops and all components passed (3.5 hours)

    •Ran the firmware update V8.40, February 2009, which covered the following (30 minutes):

    I did BIOS update from V2008.09.30 to V2008.11.02

    I did Integrated Lights out V1.61 to V1.70

    I did Array controller V1.80 to V1.82 (separately, not found on Firmware CD)

    •Installed HP Support Pack V8.20 March 2009 and updated 22 drivers. (30 minutes)

    •Installed Microsoft Hot Fix KB 932755

    •Created a new image of the server

    Word of caution: I believe there is one driver in the V8.20 support pack that will cause event ID 59 to show up. I actually had to roll back to V8.15 to get rid of the issue, but I know that HP now has either a new support pack or a fix for the event ID, or both.

    Please let me know what happens or if you need more help.

    Jeff

  • Thankfully those are some old dates there and (hopefully) newer machines won't have those problems! I personally don't have much of a warm and fuzzy about HP's firmware, and especially their iLO stuff.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Thanks Grasshopper. Good Stuff

  • Thanks, let me know if you need more information. I went through hell with that server for several months. Once I thought it was fixed, it would happen again.

    I always make sure that servers have the most current firmware and drivers when I build them. I also hope that the newer systems don't have this issue, because I am about to order another one for an Exchange server.

    Jeff