how to stop runaway proc

Hi, I installed a demo SQL Server monitor (SQL HeartBeat by SQL Solutions, I think), checked it out, and then thought I had uninstalled it.

Unfortunately I've got a runaway process left over. It's filling up the error log with:

    The activated proc [LakeSideWaitsLogger].[_LakeSide_DbTools_WaitsCPU_Monitor] running on queue msdb.LakeSideWaitsLogger._LakeSide_DbTools_Waits_LogOutQueue output the following: 'Could not find stored procedure 'LakeSideWaitsLogger._LakeSide_DbTools_WaitsCPU_Monitor'.'

It doesn't have a SQL Agent job, so I couldn't stop it that way. I restarted the server, but the proc restarted too.

Apart from filling up the log, it has prevented mirroring from restarting.

    I ran

    select * from sys.services

    and discovered

name: _LakeSide_DbTools_Waits_LogOutListener
service_id: 65540
principal_id: 1
service_queue_id: 909246294

so I guess that I need to remove this from the queue. Is this right, and how do I do it safely?

    thanks in advance for your help.

    Martin

Sorry if I seem impatient, but I'm getting worried. Does anyone have any suggestions about my problem?

    Should I drop the queue, or the service?

    Should I stop the queue first? How do I do that? I've tried

    ALTER QUEUE LakeSideWaitsLogger._LakeSide_DbTools_Waits_LogOutQueue WITH STATUS = OFF ;

    but it doesn't seem to stop the messages.

    Is this question in the wrong area?

    thanks

    martin

  • OK, I assume that you do not want/need either the service or the queue, right? They're just a leftover from the demo install?

    If so, then proceed as follows:

    1) DROP the Service. You should always be able to do this.

2) Create a dummy activation proc (not tested):

CREATE PROC LakeSideWaitsLogger._LakeSide_DbTools_WaitsCPU_Monitor AS
SET NOCOUNT ON; -- does nothing

    If there is a backlog of alerts from the queue, this should silence them.

    3) Drop the QUEUE.

    4) Drop the proc.

If for some reason this does not work, then you want to ALTER *both* the queue STATUS and the queue's activation STATUS to OFF.
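Put together, and assuming the object names from the error-log message earlier in the thread, the whole cleanup might look like this (an untested sketch, not a verified script):

```sql
-- Untested sketch of the cleanup steps above; object names come from
-- the error-log message in this thread and may differ on your server.
USE msdb;
GO

-- 1) Drop the service first (the queue cannot be dropped while a
--    service is still bound to it).
DROP SERVICE [_LakeSide_DbTools_Waits_LogOutListener];
GO

-- 2) Dummy activation proc, so any backlogged activation errors go quiet.
CREATE PROCEDURE [LakeSideWaitsLogger].[_LakeSide_DbTools_WaitsCPU_Monitor]
AS
    SET NOCOUNT ON;  -- does nothing
GO

-- 3) Drop the queue, then 4) the dummy proc.
DROP QUEUE [LakeSideWaitsLogger].[_LakeSide_DbTools_Waits_LogOutQueue];
DROP PROCEDURE [LakeSideWaitsLogger].[_LakeSide_DbTools_WaitsCPU_Monitor];
GO

-- Fallback if the drops misbehave: turn off both the queue itself
-- and its activation, as described above.
ALTER QUEUE [LakeSideWaitsLogger].[_LakeSide_DbTools_Waits_LogOutQueue]
    WITH STATUS = OFF, ACTIVATION (STATUS = OFF);
```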

    If you still need more help then I recommend trying Rusanu.com/blog, as I am supposed to be leaving for vacation right now and my wife is staring evilly at me...

-- RBarryYoung, (302)375-0451
blog: MovingSQL.com, Twitter: @RBarryYoung
Proactive Performance Solutions, Inc.
"Performance is our middle name."

  • Thanks Barry, I sort of worked out most of that, except for creating a new proc. Enjoy your vacation.

The messages have stopped, but the service/queue has about 20 million messages. It's in msdb, which is now about 60GB. Is it safe to delete this service? I guess it will take a long time and create a large log.

Is there a truncate option I can use?

    Martin

Dropping a service should always be possible (though there might be a way to lock it up with schemabinding; I'm not sure whether that applies to Service Broker services).

Dropping the queue should be the same as dropping a table, which is effectively the same (data-wise) as truncating a table, as far as the log is concerned.

The other alternative would be to write an activation proc to "drain" the queue. It would be like my previous example proc (which drains the activation signals, but not the queue messages), except that you add a RECEIVE command to pull a message off the queue. However, if you've already drained the activation signals, then you will need to drive it like a normal procedure: RECEIVE in a loop, e.g. WHILE (SELECT COUNT(*) FROM YourQueue) > 0.
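A driven drain loop along those lines might look like this (a sketch only; it assumes the queue names from earlier in the thread and that the queue's STATUS is ON, since RECEIVE fails against a disabled queue):

```sql
-- Sketch of a manual drain loop: RECEIVE pulls messages off the queue
-- in batches and throws them away until none remain. The batch size
-- is arbitrary.
USE msdb;
GO

DECLARE @batch TABLE (message_body varbinary(max));

WHILE EXISTS (SELECT 1
              FROM [LakeSideWaitsLogger].[_LakeSide_DbTools_Waits_LogOutQueue])
BEGIN
    -- Receive a batch of messages into the table variable...
    RECEIVE TOP (1000) message_body
    FROM [LakeSideWaitsLogger].[_LakeSide_DbTools_Waits_LogOutQueue]
    INTO @batch;

    -- ...and discard it.
    DELETE FROM @batch;
END;
```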


  • Barry,

    I couldn't drop the queue because it was bound to the service.

I dropped the service, but it seemed to be logging the operation in the tran log. I cancelled the drop when the tlog got up to 10GB.

Plan B was your idea of using the RECEIVE command. It's fairly resource-intensive and hits disk usage a lot, so I'm using RECEIVE TOP (20000) ... and looping through that with "GO 20". This loop runs for about 3 minutes each time. That way I'm not hitting the system too much, and I can cancel out without having to roll back many deletes if I lock too many users out.
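The batched drain described above can be sketched as follows (the TOP size and repeat count are the ones mentioned; "GO 20" is an SSMS/sqlcmd feature that re-runs the preceding batch 20 times):

```sql
-- Drain in batches of 20,000, repeated 20 times per run, so each run
-- stays short enough to cancel without a huge rollback.
USE msdb;
GO

DECLARE @discard TABLE (message_body varbinary(max));

RECEIVE TOP (20000) message_body
FROM [LakeSideWaitsLogger].[_LakeSide_DbTools_Waits_LogOutQueue]
INTO @discard;
GO 20
```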

    thanks

    Martin

  • Martin Walter (7/12/2009)


    Barry,

    I couldn't drop the queue because it was bound to the service.

    Yep, that's why you have to drop the service first.

    I dropped the service but it seemed to be logging the op in the tran log. I cancelled the drop when the tlog got up to 10GB.

Now that's something I've never heard of before. Can you show us some of the log entries that led you to that conclusion? Or alternatively, what were the symptoms that made you think that?

    What may be happening is that something is still flooding the service and it went into infinite error mode when you tried to drop the service.

Plan B was your idea of using the RECEIVE command. It's fairly resource-intensive and hits disk usage a lot, so I'm using RECEIVE TOP (20000) ... and looping through that with "GO 20". This loop runs for about 3 minutes each time. That way I'm not hitting the system too much, and I can cancel out without having to roll back many deletes if I lock too many users out.

    Good Idea! I should have thought of that (sorry)...

I'm kind of puzzled that it sounds like this is still going on... Is the queue getting filled up again? If so, then something must be feeding it, and ultimately you will need to find out what that is.


  • Barry

    RBarryYoung (7/13/2009)


    Now that's something I've never heard of before. Can you show us some of the log entries that led you to that conclusion? Or alternatively, what were the symptoms that made you think that.

The symptom was that the log file grew rapidly while the drop was running and stopped when I cancelled the drop. The file grew to about 20GB.

    I'm kind of puzzled that it sounds like this is still going on...? Is the queue getting filled-up again? If so, then something must be feeding it and ultimately you will need to find out what that is.

    That wouldn't surprise me. Here's what I've done and some query results that I hope will help you identify what's going on:

    - there are no error log entries for the queue anymore

- there's nothing in sys.transmission_queue

    - to allow me to run the RECEIVE statement I ran

    ALTER QUEUE LakeSideWaitsLogger._LakeSide_DbTools_Waits_LogOutQueue

WITH ACTIVATION (STATUS = ON, PROCEDURE_NAME = LakeSideWaitsLogger._LakeSide_DbTools_WaitsCPU_Monitor, EXECUTE AS SELF);

    ALTER QUEUE LakeSideWaitsLogger._LakeSide_DbTools_Waits_LogOutQueue WITH STATUS = ON;

    - I then run

    receive top (20000) * from LakeSideWaitsLogger._LakeSide_DbTools_Waits_LogOutQueue

    - and after that I run this

    select top (2) object_name(id),[name],indid,reserved from sysindexes order by reserved desc

    which gives me

(No column name)          name                   indid  reserved
queue_messages_909246294  queue_clustered_index  1      8398161
queue_messages_909246294  queue_secondary_index  2      181844

This gets smaller (slowly) after I run the RECEIVE statement, but it also grew overnight - hence I think the queue is still being used.


- here's the object info on the queue object

    select *

    from sysobjects

    where name = 'queue_messages_909246294'

this gives (the legible columns, at least):

name: queue_messages_909246294
id: 925246351
xtype: IT (internal table)
parent_obj: 909246294
crdate: 2008-12-23 09:03:54.307

thanks for your help

    Martin

  • What do you get from "SELECT Count(*) From YourQueue"?


  • RBarryYoung (7/13/2009)


    What do you get from "SELECT Count(*) From YourQueue"?

    I got a long wait and then 26,000,000.

    I ran it after a receive statement and then got 25.6M.

I also ran a SELECT * FROM sys.conversation_endpoints, which showed that one conversation had a state of CONVERSING, and its service id and conversation id matched the data in the queue.

    thanks

    Martin
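Since a single conversation appears to hold all the messages, one alternative (not pursued in this thread) is to end that conversation WITH CLEANUP, which discards its queued messages outright without notifying the far end. A sketch, under the assumption that the stuck conversation is the only one in the CONVERSING state:

```sql
-- Alternative sketch: end the single stuck conversation WITH CLEANUP,
-- which removes its messages from the queue without a normal dialog
-- shutdown. Removing ~25M messages this way is still one big logged
-- operation, so the transaction-log caveat discussed above applies.
USE msdb;
GO

DECLARE @handle uniqueidentifier;

SELECT @handle = conversation_handle
FROM sys.conversation_endpoints
WHERE state_desc = N'CONVERSING';  -- the one stuck conversation

END CONVERSATION @handle WITH CLEANUP;
```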

The only thing that I can figure, then, is that dropping the Service must also be either deleting or updating the ~25M messages in the queue, which is why the log got hammered.

Your best bet at this point is probably to continue with your current strategy of RECEIVE TOP (20000) with GO 20 every few minutes. If you haven't already done so, you could set it up in an Agent job to run every X minutes until it's done. If you set X to 20 minutes, it should be done in about a day.
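The Agent job could be set up roughly like this (a sketch; the job, step, and schedule names are invented for illustration, and the msdb job procs take many more optional parameters than shown):

```sql
-- Sketch: an Agent job that runs one drain batch every 20 minutes.
-- Job/step/schedule names here are made up for illustration.
USE msdb;
GO

EXEC dbo.sp_add_job
    @job_name = N'Drain leftover LakeSide queue';

EXEC dbo.sp_add_jobstep
    @job_name      = N'Drain leftover LakeSide queue',
    @step_name     = N'RECEIVE batch',
    @subsystem     = N'TSQL',
    @database_name = N'msdb',
    @command       = N'DECLARE @d TABLE (message_body varbinary(max));
RECEIVE TOP (20000) message_body
FROM [LakeSideWaitsLogger].[_LakeSide_DbTools_Waits_LogOutQueue]
INTO @d;';

EXEC dbo.sp_add_jobschedule
    @job_name             = N'Drain leftover LakeSide queue',
    @name                 = N'Every 20 minutes',
    @freq_type            = 4,   -- daily
    @freq_interval        = 1,
    @freq_subday_type     = 4,   -- every N minutes
    @freq_subday_interval = 20;

-- Target the local server so the job actually runs.
EXEC dbo.sp_add_jobserver
    @job_name = N'Drain leftover LakeSide queue';
GO
```

Once the queue count reaches zero, the job can be deleted with sp_delete_job.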


  • RBarryYoung (7/14/2009)


    The only thing that I can figure then is that dropping the Service must also be either deleting or updating the ~25M messages in the queue, which is why the Log got hammered.

    I think that's right.

Your best bet at this point is probably to continue with your current strategy of RECEIVE TOP (20000) with GO 20 every few minutes. If you haven't already done so, you could set it up in an Agent job to run every X minutes until it's done. If you set X to 20 minutes, it should be done in about a day.

Good idea, I'll do that.

    thanks for your help.

    martin

  • Tx, saved my butt here too!
