UnKillable Locks

  • Our production server has run into an issue 3 times now and I am running out of ideas. The problem is with one particular table that a process will lock up and never let go. I am unable to Kill the process and a reboot seems to be the only cure. I read this article http://support.microsoft.com/default.aspx?scid=kb;EN-US;q171224 and it talks about SQL 6.0 and 6.5 but I wonder if the article applies to 2k. The lock that causes all the problems seems to be coming from one particular application when it updates one particular table. I cannot find any anomalies in the way the connections are created or destroyed. It seems to be random. I wonder if deleting and rebuilding the table will help? Any ideas or suggestions?

    Thanks,

    Chris



    - Vega

  • Not sure that will help. Is the database in 2k mode or 65 mode?

    I suspect that the application is doing something funny. Can you trace it and replay the trace and cause the lock?

    Steve Jones

    sjones@sqlservercentral.com

    http://www.sqlservercentral.com/columnists/sjones

  • I have not run a trace against it because the rate of failure is very small, like every 500 times or more it will lock up. The user and host server are constantly connecting up to SQL to run queries (it's coming from our intranet) I'm not sure if I could filter it down enough to be useful. The query that is run everytime it locks is a simple update query to an employee table. I've run the same query numerous times after the server reboots and it works fine. Currently I'm getting the query by double clicking on the locked up process and copying it from there.



    - Vega

  • Crazy. Got a couple bad bugs here myself. Not sure what to tell you. IS the issue an insert/update/delete or a select?

    Steve Jones

    sjones@sqlservercentral.com

    http://www.sqlservercentral.com/columnists/sjones

  • It only seems to be happening on updates. I've checked out a few other forums and this problem seems to happen from time to time and the only solution I've seen yet is to reboot the server. I think I'll try to rebuild the table and indexes, then repopulate the data. It can't hurt anyways.



    - Vega

  • Even I faced the same problem. I had one application running smoothly in 7.0 but when I upgraded to 2k it started getting blocked. I resolved it by improving the Hardware performance. I guess your update is utilzing some process which utilize the system resources heavily, so system is not able to comeout of the process at the same time a request is made for the above said system resource.

    Alternate solution is set the Querytime out property for the SQL Server. This will resolve your problem of blocking by skipping the query which takes more time(i.e the blocking senareo you are facing due to update), but its a temporary solution.


    Manick
    Database Administrator

  • Even I faced the same problem. I had one application running smoothly in 7.0 but when I upgraded to 2k it started getting blocked. I resolved it by improving the Hardware performance. I guess your update is utilzing some process which utilize the system resources heavily, so system is not able to comeout of the process at the same time a request is made for the above said system resource.

    Alternate solution is set the Querytime out property for the SQL Server. This will resolve your problem of blocking by skipping the query which takes more time(i.e the blocking senareo you are facing due to update), but its a temporary solution.

    Manick

    Database Administrator


    Manick
    Database Administrator

  • I would hate to think that this is a hardware issue. We're running with 2 700 Zeon’s, 4 gig ram and about 400 gig HD. Also using RAID 0+1 with a Fiber controller for each raid set. But you did get me thinking, a few months ago we lost 2 gigs of ram, and we've been having this problem intermittently since then. We replaced the RAM, but maybe something got corrupted. The biggest thing that bothers me is the fact that I cannot KILL the process. I'll keep investigating.



    - Vega

  • We had a similar problem and I had to come up with a 'fix' without changing the

    stored procedures or hardware cos it was

    all thrid party. I wrote a script that runs

    once a minute and checks for blocking processes if it finds one it will 'note' it.

    If the process is still blocking after one minute it will netsend a message to a particular user telling them to log off etc.

    It will continue giving warnings each minute for about 4 mins before it gives a last warning telling the user is about to be 'killed'.

    The program logs the actions that it has taken - Who, when etc onto a table.

    Once it gives a warning it dynamically creates a trace and starts to trace the actions of that particular SPID which it saves to a file.

    Kinda rough I know but I did not have a lot of time.

    You are welcome to the code - Everything is configurable, if you know where to look.

    It has saved me no end of work and has been working non-stop for about a year - Don't

    ask why the problem has not been fixed!

    I can let you have a copy of the modules if you think it will help.

  • I'll try anything, once... But I'm wondering if this program yo uhave will be able to KILL the lock. I've tried everything and nothing, except rebooting, will kill it. I'm wondering though if it will KILL if I catch it before the buffers fill up.



    - Vega

  • Why don't you specified in the aplication the lock_timeout setting. Remember that you must handle the error in the aplication.

  • I have seen a similar problem here where I work. In our case, we had a stored procedure that was instantiating a .dll (via sp_OACreate) and then executing it. Periodically, the thread it was using would be switched and the return from the .dll would be "lost" in a manner of speaking. At this point, the SQL SPID would hang in EXECUTE mode but would never recieve a reply from the process it was running. It was unKILLable. The only way to kill it was to restart SQL/Server (this was a better solution than rebooting the server, as the recovery time was faster since all the other server processes didn't have to stop and restart). The instances of this happening went WAY down when we turned off lightweight thread pooling (the "Use NT fibers" option) in SQL/Server. According to Microsoft, when using lightweight pooling, SQL can switch threads without notifying the O/S. In any case, it happens very infrequently now; still, when it does, I still have to stop and restart SQL/Server to kill that process. (Wouldn't it be cool to have some sort of sp_lock_release proc that removed the locking information for those tables from syslocks? I've thought of trying to experiment with doing that myself, but I know darn well it would probably mess up a lot of other tables).

  • I make a point of having "lightweight thread pooling / NT fibers" turned off for exactly this reason, but if I ran into this on a server I absolutely could not re-boot, I'd try using HandleEx (from http://www.sysinternals.com).

    But only do this if you're very sure you know what you're doing with Win2K processes and such ... you can easily blow up a server ...

    BUT I have avoided having to reboot a server a few times by using it to remove a lock on a file or some such ...

  • Hey Vampire,

    I'd be interested in looking at that code! You could either eMail it to cstrong@kingston-technology.com

    Or better still, maybe you could post it as a script for everyone to use?

    Clive Strong

    clivestrong@btinternet.com

  • I would love copy of this code too if you are still sharing it as I have a similar problem

    cheers -

    cathythomas73@hotmail.com

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply