December 9, 2011 at 4:51 am
the database grows by 1Gb a day and has done for the last couple of years, so nothing new on that front. The server isn't virtualised.
It doesn't even seem to the be the same queries that are blocked each time. I am trying to apply indexs now that the performance dashboard is recommending but due to the blocked processes it is taking forever.
December 9, 2011 at 4:53 am
SAN? Has the SAN admin changed anything? Is the switch dedicated or shared? If shared, it is overutilised?
Are you sure nothing changed on the hardware?
Anything in the SQL error logs or windows event logs?
Have you considered getting a consultant (a bloody good one) in to assist? This is probably easier to solve on-site than via forums.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
December 9, 2011 at 4:58 am
steve.clark 39713 (12/9/2011)
the database grows by 1Gb a day and has done for the last couple of years, so nothing new on that front. The server isn't virtualised.It doesn't even seem to the be the same queries that are blocked each time. I am trying to apply indexs now that the performance dashboard is recommending but due to the blocked processes it is taking forever.
If pending io requests are continuously around 80-90, I think its way too high for a good IO subsystem. Some thing is definitely messed up there.
Ask your SAN administrator to check it once again.
Have you distributed your database's files on different disks or they all are sharing same disk?
P.S. If there is any perticular large table which is missing some indexes & which is heavily used in queries, it can also bring the performance down.
Creating those indexes might help but your I/O subsystem has got some problem.
December 9, 2011 at 4:59 am
Taking stabs in the dark here.
Isolation level changed?
+1000 on the bloody good consultant idea. Might be nothing, might be a whole lot harder than nothing to find to.
#1 option for me assuming you definitely didn't change anything at all whatsoever on that server would be to do down this road =>
You might have a heavy hitter query that went from 1 ms to 500 ms and killing the server in the process.
That's the only sudden change I can think of aside from any setting / patch / software change.
December 9, 2011 at 5:02 am
Divine Flame (12/9/2011)
steve.clark 39713 (12/9/2011)
the database grows by 1Gb a day and has done for the last couple of years, so nothing new on that front. The server isn't virtualised.It doesn't even seem to the be the same queries that are blocked each time. I am trying to apply indexs now that the performance dashboard is recommending but due to the blocked processes it is taking forever.
If pending io requests are continuously around 80-90, I think its way too high for a good IO subsystem. Some thing is definitely messed up there.
Ask your SAN administrator to check it once again.
Have you distributed your database's files on different disks or they all are sharing same disk?
P.S. If there is any perticular large table which is missing some indexes & which is heavily used in queries, it can also bring the performance down.
Creating those indexes might help but your I/O subsystem has got some problem.
On the IO shot dead side of things. I've seen AV, backups, restore, reindexes be able to cause something similar to this. Maybe you have 2 at the same time + san network card misconfigured killing IO access. I've had a case like this where the card was a 1 GB but for some reason was running at 100 mbps.
re-+1 on the consult. You need someone like Gail or Brent Ozar or SqlSkills.com to get in your system to fix it.
December 9, 2011 at 5:16 am
Ninja's_RGR'us (12/9/2011)
Taking stabs in the dark here.Isolation level changed?
I suffered this issue few months back. The application developers uploaded a new build & they explicitly set the isolation level to serializable in the .Net code itself. It was one hell of the blocking on the server all of a sudden :hehe:.
Funniest thing was they didn't tell me this themselves until I found it myself & questioned them about it.
As soon as that was corrected in the code, the performance came back to normal:satisfied:.
December 9, 2011 at 5:35 am
If you cannot relate to a change at application side, or db server software level side, I would double check all physical drives this server is connected to.
We've had a case where a server only had a number of raid1 drives and noticed a drastic slow down but received no error messages, alerts or indications at all until we actually went to check the box and heart strange noises.
Unplugged the drive which we found making most noise and the server got back up to speed.
Of course the disk was replaces and the raid rebuild has been run, but the server now runs at the same paces as before this incident.
Johan
Learn to play, play to learn !
Dont drive faster than your guardian angel can fly ...
but keeping both feet on the ground wont get you anywhere :w00t:
- How to post Performance Problems
- How to post data/code to get the best help[/url]
- How to prevent a sore throat after hours of presenting ppt
press F1 for solution, press shift+F1 for urgent solution 😀
Need a bit of Powershell? How about this
Who am I ? Sometimes this is me but most of the time this is me
December 9, 2011 at 6:40 am
Checked isolation level at that hasn’t changed
Checked the server logs and there are no errors at all. Is there any other way I can check the SAN is ok?
Consultant will probably be a last resort that we try Monday next week. The business is looking to send home staff and give us tonight to reoslve. There is a Saturday shift in tomorrow.
will look at doing this
this afternoon
On a plus note the number of blocked processes is now around 40 (halved). I think this is more to do with user level though.
December 9, 2011 at 7:05 am
steve.clark 39713 (12/9/2011)
Is there any other way I can check the SAN is ok?
SAN won't reflect in the server logs. You need to run SAN diagnostic and check the SAN logs.
Absolutely no errors in the SQL error log, windows application or system event logs?
Honestly, I doubt that performance tuning article is going to help much. This kind of sudden break is not usually the result of slow queries, that causes slow degradation that's visible for some time. Absolutely fine to completely dead for every query in the system is not usually due to normal growth of queries, it's usually due to some change somewhere (indexes dropped, massive table data change, new application/massively modified app, broken hardware, overutilised hardware (especially shared stuff like SANs or switches), service pack installation, etc)
p.s. If you do get someone in, don't get the cheapest person available. You want someone good to help, they won't be cheap. If you have a premier support agreement with Microsoft, now is the time to use it, get a PFE in.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
Viewing 9 posts - 16 through 24 (of 24 total)
You must be logged in to reply to this topic. Login to reply