Killed 5 Processes Table Locks Yesterday, Still in Killed/Rollback Status

Question

Killed 5 Processes Table Locks Yesterday, Still in Killed/Rollback Status

Welsh Corgi

SSC Guru

Points: 116520
More actions
July 7, 2012 at 8:04 am

#272167

I got a e-mail yesterday after hours from a Lead and he told me that he thought they had locks on a table and asked me to investigate.
Sure enough there were locks 5 locks on the same table.
I killed the processes yesterday, 17 hours ago and the processes are still in a rollback status.
I try to kill the processes again the the completion of the rollback is zero percent.
What can I do for it appears that the rollback is not going to complete?
SPID 57: transaction rollback in progress. Estimated rollback completion: 0%. Estimated time remaining: 0 seconds.
SPID 71: transaction rollback in progress. Estimated rollback completion: 0%. Estimated time remaining: 0 seconds.
SPID 69: transaction rollback in progress. Estimated rollback completion: 0%. Estimated time remaining: 0 seconds.
SPID 54: transaction rollback in progress. Estimated rollback completion: 0%. Estimated time remaining: 0 seconds.
SPID 61: transaction rollback in progress. Estimated rollback completion: 0%. Estimated time remaining: 0 seconds.
For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply

Welsh Corgi SSC Guru Points: 116520 More actions · Answer 1

I restarted the SQL Server Service for that instance to get rid of those processes

For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

SQLRNNR SSC Guru Points: 281334 More actions · Answer 2

Welsh Corgi (7/7/2012)
I restarted the SQL Server Service for that instance to get rid of those processes

Unfortunately, as a last resort, that is the only other thing that can be done. First option is to wait it out.

Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events

Welsh Corgi SSC Guru Points: 116520 More actions · Answer 3

I had killed the processes 17 hours before I restarted the Service.

The percent complete of the rollback was 0% on all four processes.

What I found odd was that the Last Batch Dates were one 6/09/2012 and three 6/10/2012.

They were executing Stored Procedures with an OPENQUERY Statement to a remote DB2 Database on an AS400.

For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

ksrikanth77 SSCrazy Points: 2232 More actions · Answer 4

Sometimes restarting the SQL Services will not clear the killed rollback state. Please restart the server. It will get cleared.

Regards

S.R.Kundur

Jeff Moden SSC Guru Points: 1003863 More actions · Answer 5

Welsh Corgi (7/7/2012)
I had killed the processes 17 hours before I restarted the Service.
The percent complete of the rollback was 0% on all four processes.
What I found odd was that the Last Batch Dates were one 6/09/2012 and three 6/10/2012.
They were executing Stored Procedures with an OPENQUERY Statement to a remote DB2 Database on an AS400.

Actually, this is a well know problem to those using OPENQUERY to a remote DB2 database on an AS400. I went through the very same problem just about a week ago. It turns out that there's a "CONNECT" item on the subject and MS closed it with a "Call Customer Support when this happens" answer.

What happens with these things is that the rollbacks were actually successful and the "0%" to go message (in these instances) is actually correct. The problem with the continued attempts at rollbacks is that the SPIDs are actually consuming a whole lot of CPU time. It's like they're stuck in a loop or something. They also keep locks on the database which can sometimes interfere with things such as dropping the database (which I do every night on my test restore system).

Once all other methods to try to overcome this problem have been exhausted, bouncing the service (or even the machine itself) is the only way to kill these things. The problem with that is that not all rollbacks are so kind. If there actually is something to rollback, the rollback (maybe roll forward) will continue during the startup. I've built some code at work that I used to determine if there was actually anything to rollback to see if it was actually safe to bounce the service.

There's one more thing that you have to do after that... alert the DB2 folks because killing the job will leave a long running job open on the DB2 side, as well. It does about the same thing as it does in SQL Server... it spins its wheels eating a lot of CPU to do nothing.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Welsh Corgi SSC Guru Points: 116520 More actions · Answer 6

Jeff,

Thanks for all of the information.

I appreciate it.

Regards...

For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Welsh Corgi SSC Guru Points: 116520 More actions · Answer 7

Thanks Jeff.

I had checked what they were doing. It was actually 5 processes, all performing merge statements on the same table with different criteria on the WHERE Clause.

Their justification for doing this is that it will increase the bandwidth and speed up the load.

I have provided them with an abundance of CPU and memory.

I converted the OPENQUERY statements to OLE DB Source (AS400) and OLE DB Destination (SQL Server) and as expected the initial load to staging completed in a small fraction of the time it took to do an OPENQUERY.

Unfortunately we can't do that with the incremental on the large and they are doing MERGE OPENQUERY.

The problem is that most of the tables do not have Insert, Modified Date on them so they can't just select on the Datetime stamps.

On the smaller tables I converted their logic to a TRUNCATE & Fast Load OLE DB.

Edit: Typo on TRUNCATE

For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

J Livingston SQL SSC Guru Points: 51550 More actions · Answer 8

Welsh Corgi (7/8/2012)
The problem is that most of the tables do not have Insert, Modified Date on them so they can't just select on the Datetime stamps.
On the smaller tables I converted their logic to a TRUNCDATE & Fast Load OLE DB.

...had to do similar on a third pty database.

smaller tables...truncate and reload

and if IIRC ended up paying to get modified date columns on larger (excess 1m rows)..then there was still an on going debate about creating relevant index...hey ho 🙂

________________________________________________________________
you can lead a user to data....but you cannot make them think
and remember....every day is a school day

jonnomaher Valued Member Points: 63 More actions · Answer 9

Would utilising sysinternals TCPView tool to identify the process and terminate it outside of MSSQL not help with these SPIDS? Often with past experience and SQL versions, the SQL process had lost its handle to the operating system PID when this kind of thing had occured.

1) Download TCPView http://technet.microsoft.com/en-us/sysinternals/bb897437

2) Run TCPView, locate the connection to linked server and close the connection.