Killed SPID on xp_LogInfo not rolling back and growing TempDB

  • Hi Everyone,

    Looking for a bit of advice and what is a strange issue and the first time i have come across something like this,

    I received a notification that the space on the drive my tempDB sits on was running low, when checking the running process it pointed to a server agent task,

    i checked and we had one which had been running for over a day, usually runs for less than a minute, and it was showing a status of "Executing: 0(Unknown)" in the job activity monitor,

    When i tried to stop the Job in Activity Monitor nothing happened it just kept on running and filling up TempDB to almost the full size of the disc, knowing that the job itself could just be stopped and re-run without affecting the business i, perhaps too hastedly, killed the SPID thinking this would "release" space back to tempDB and i would then restart the services out of hours to shrink the files,

    Unfortunately this didnt work and the spid has been in a Killed/rollback state for nearly a day now whilst the tempDB files grow (i arranged for the disk they sit on to be increased whilst i looked into it),

    When i check the process for the SPID it appears to be running xp_LogInfo with the last batch relating to " insert #nt select distinct domain+N'\'+name, sid, sidtype from OpenRowset(TABLE NETUSERGETGROUPS, @acctnam"

    I believe that the owner of the SQL Job may have reset their password when the job kicked in to cause this (this is just me guessing) which has caused it to get stuck in a loop?

    I am wondering if restarting the Server Agent Service may force a stop on the Job and release this but i am bit weary to start playing with the services,

    Does any body have any ideas?

    Many thanks in advance

    Shayn

    ---------------------------------------------------------------------------------------
    The more you know, the more you know that you dont know

  • ..also, the CPU and MemUsage in sysProcesses hasn't changed for quite some time with the LastWaitType being PREEMPTIVE_OS_AUTHORIZATIONOPS

    ---------------------------------------------------------------------------------------
    The more you know, the more you know that you dont know

  • This is a tough one. Haven't heard or come across this.

    What if you stop and start sql server agent services? Sounds like you have nothing to loose

    Can you fail over to secondary?

  • Hi Cebisa,

    Thanks for your reply,

    Unfortunately i don't have a secondary to fail over to,

    Yeah i am leaning towards a restart of the Server Agent Services but whilst it isn't, yet, causing me any front end issues i was wondering if there was anything else that could be done just in case the restart doesn't work,

    Regards

    Shayn

    ---------------------------------------------------------------------------------------
    The more you know, the more you know that you dont know

  • What else is going on?

    Xp_loginfo should not be causing such an issue.

    Any blocking? Sp_who2, sp_who2 'active', dbcc inputbuffer(spid)

  • i may be going at this from the wrong angle but the only issues that are being presented are a job stuck "Executing: 0(Unknown)" and a constantly growing TempDB, no blocking is occurring,

    If i run DBCC INPUTBUFFER for the SPID i killed for this it returns the below

    EventType - Language Event

    Parameters - 0

    EventInfo - (@P1 nvarchar(128),@P2 uniqueidentifier)EXECUTE msdb.dbo.sp_sqlagent_has_server_access @login_name = @P1, @job_id = @P2

    Would this also point to an issue with the account of the user of the SQL job?, he changed his NT password at around the time the job was kicking in

    Thanks

    Shayn

    ---------------------------------------------------------------------------------------
    The more you know, the more you know that you dont know

  • also, just to confirm, it is the TempDB log file that is growing not the data file,

    not sure if that makes any difference?

    ---------------------------------------------------------------------------------------
    The more you know, the more you know that you dont know

  • restarting the server agent service has not had the desired effect,

    the sql job has now stopped but the SPID remains in a KILLED/ROLLBACK state and the tempdb log file continues to grow,

    does anybody have anymore ideas please,

    any help is greatly appreciated

    ---------------------------------------------------------------------------------------
    The more you know, the more you know that you dont know

  • when running DBCC OPENTRAN(TEMPDB) i get the below, the spid it relates to is the one i killed

    Transaction information for database 'tempdb'.

    Oldest active transaction:

    SPID (server process ID): 205

    UID (user ID) : -1

    Name : INSERT EXEC

    LSN : (5293:26478:20)

    Start time : Oct 19 2020 3:30:02:193PM

    SID : 0x0105000000000005150000005f9b10c13b99fcfc4ad890fd8e220200

    DBCC execution completed. If DBCC printed error messages, contact your system administrator.

    Completion time: 2020-10-22T19:56:38.4584380+01:00

    ---------------------------------------------------------------------------------------
    The more you know, the more you know that you dont know

  • Pretty sure you are just stuck waiting for it to finish rolling back.

    Basically, if the query made changes to anything, SQL needs to undo those changes.

    I would check the results of the following query:

    SELECT
    [session_id]
    , [percent_complete]
    , [estimated_completion_time]
    , [last_wait_type]
    , [wait_resource]
    , [wait_time]
    , [wait_type]
    FROM[sys].[dm_exec_requests]
    WHERE[status] = 'ROLLBACK';

    (query obtained from http://www.sqlservice.se/check-progress-of-rollback/).  That second column "Percent_Complete" will tell you how far the rollback is complete if SQL can tell, and the "estimated_completion_time" will give you a rough idea of how long SQL thinks it will take to roll things back.

    Apart from restarting the SQL instance, I am pretty sure all you can do is wait for it to finish.  If it is not causing any business problems, I would check the above query and try waiting rather than restarting the SQL Service if it was me.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • Hi Brian,

    Thanks for your response,

    I ran the query but it didn't return anything,

    When i ommited the status = 'Rollback' line i can see that the session_id/SPID shows the bellow

    Session_ID - 205

    percent_complete - 0

    estimated_completion_time - 0

    last_wait_type - PREEMPTIVE_OS_AUTHORIZATIONOPS

    wait resource -

    wait_time - 279093770

    wait_type - PREEMPTIVE_OS_AUTHORIZATIONOPS

    status - running

    whilst it isn't causing any front end issues it its constantly growing the tempdb log file and i will be at disk capacity before too long

    any ideas on what else i can check?

    i am pretty sure that the process was running for 43 hours before i killed it, it has been in a killed/rollback state for 34 hours

     

    ---------------------------------------------------------------------------------------
    The more you know, the more you know that you dont know

  • Just a shot in then dark (which I can help with), does anything in the job make a call to xp_CmdShell?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hi Jeff,

    Thanks for your response,

    No the job itself was just running a stored proc which called some other procs, some of which being on linked servers,

    It didnt seem to even get to the first line though, it was showing "Executing: 0(Unknown)" in the job activity monitor,

    Since restarting the SQL Server Agent service the job has now stopped, and been disabled, but the tempdb continues to grow

    Regards

    Shayn

    ---------------------------------------------------------------------------------------
    The more you know, the more you know that you dont know

  • Since the query didn't return anything, my thought is the rollback completed successfully. Where are you seeing the "killed/rollback" information?

     

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • Hi Brian,

    I am seeing that as the CMD column in sys.sysprocesses for the SPID,

    Also in sp_Who2

    Thanks

    Shayn

    ---------------------------------------------------------------------------------------
    The more you know, the more you know that you dont know

Viewing 15 posts - 1 through 15 (of 22 total)

You must be logged in to reply to this topic. Login to reply