SQL Clone
SQLServerCentral is supported by Redgate
Log in  ::  Register  ::  Not logged in

Confessions of a DBA - Part 1

Couple of year's back I brought the SQL Server down to its pathetic knees by using a HINT. Yes, with just one hint on a stored proc. Before I give specify what the HINT was, let me provide the details of why I thought of even giving a hint.

Our company has a website that has very high traffic. Our normal bandwidth usage is 8 MB per second and it can peak upto 15 MB per second. Currently we do 13 K batch request per second. During the time this happened, we were doing 5K batch request per second.

While monitoring the activity on the server, I saw a strange behavior for one of the stored procs that was heavily used by the site (around 700 calls per second and we were not using any caching during that time). Usually that particular stored proc was using around 6000 page reads when executing. All of a sudden the number page reads would increase to 20K. The CPU usage goes up by 10%. I was not able to reproduce this behaviour in the QA environment. As soon as I do a sp_recompile for the stored procedure, the page reads would go back to normal, the CPU usage comes down.

My train of thought was what if this happens when I am not here? I was the lone DBA and there was no one else who had access to the SQL Server other than me. Therefore I decided to use a hint of WITH RECOMPILE on the stored proc. This worked fine when I tested in my test environment. Since this was just a stored proc change, it was released without much thought. My trouble started at that time. As soon as I released it I saw that everything started getting sticky. I was able to execute one query (In house who is active stored proc) and it showed thousands of SPIDS trying to execute this stored proc and it was blocking each other. Just imagine 700 stored procs trying to do a recompile in one second.

After that everything hanged. The website went down, all application went down (Time out errors), SQL Query analyzer was not responding (Yes, it was in SQL 2000 now it is 2008). I tried putting back the original stored proc but that was hanging as well. I called up the management and told them what happened and told them that I will need to restart the SQL Server box. They were not thrilled by it but I was thankful that I was not shown the door. The SQL Server was restarted after the website traffic was blocked. I put the original stored proc back in and everything went back to normal.

I learnt from my mistake. I never did that again. But to solve the issue at hand, I created a Job that would do a recompile for that stored proc once every 15 minutes. Therefore the maximum time that the stored proc would misbehave was 15 minutes. The real solution was to rewrite the stored proc. This I did after doing some research on parameter sniffing.


Posted by Jason Brimhall on 8 April 2011

It's nice to see posts like this from time to time.  I started doing a similar thing based on a question asked of me.  I was asked to discuss some of my biggest mistakes.  Personally I prefer to forget those, but it is good to document them and learn from it.

Posted by Roy Ernest on 8 April 2011

You always learn from your mistakes. maybe if others saw the mistakes made by someone, they might learn as well.

Posted by Christoph D on 15 April 2011

A person should learn from mistakes, a smart person even learns from mistakes made by others.

I had the same behaviour on our live-server once but I never thought about this hint even if a recompile made everything behave like normal. We did some parameter-sniffing and afterwards added some join-hints to ensure the server is doing it the way we want it. It is not the smartest idea but on the other hand better one execution with a non-optimal plan than all executions with a bad plan after getting the wrong parameters once.

Has anybody maybe a better solution how to solve this problem?

Posted by Roy Ernest on 15 April 2011

Grant wrote an excellent post about parameter sniffing. Maybe it could help you.


Leave a Comment

Please register or log in to leave a comment.