Corrupted DB after transaction log full

Question

Corrupted DB after transaction log full

Vedran Kesegic

SSCrazy Eights

Points: 9302
More actions
September 30, 2012 at 3:56 pm

#282363

One-time DB job was scheduled at night to add two columns (computed, persistent) on a large table (5 GB together with indexes, witout compression about 10GB).
It is a compressed heap with several nonclustered indexes also compressed, partitioned by month with aligned partitions.
The command in the job was just "ALTER TABLE xy ..." that adds two columns.
The transaction lasted for several hours, and failed.
After that, the database was online, but most of users users got very odd errors every time:
Error: 615, Severity: 21, State: 1
Could not find database ID 6144, name '6144'. The database may be offline. Wait a few minutes and try again.
Some users could use the db normally, but most of the users got hundreds of that message, sometimes with different number as the "name" of the database. Such numeric names of DB of course do not exist and have never existed on the server.
Totally crazy!
DB was online and DBCC CHECKDB found no errors.
But almost none of the users could use the DB, so I had to fix it somehow.
Error log for the job reported this set of errors:
TCP Provider: An existing connection was forcibly closed by the remote host.
[SQLSTATE 08S01] (Error 10054)
Communication link failure [SQLSTATE 08S01] (Error 10054)
The transaction log for database '*DBNAME*' is full.
To find out why space in the log cannot be reused, see the log_reuse_wait_desc column in sys.databases
[SQLSTATE 42000] (Error 9002)
The log for database '*DBNAME*' is not available.
Check the event log for related error messages.
Resolve any errors and restart the database.
[SQLSTATE HY000] (Error 9001)
During undoing of a logged operation in database '*DBNAME*', an error occurred at log record ID (388242:11656:483).
Typically, the specific failure is logged previously as an error in the Windows Event Log service.
Restore the database or file from a backup, or repair the database.
[SQLSTATE HY000] (Error 3314)
During a job execution, a Transaction log grew from 6 to 36GB. That is still not enough to fill-up the disc space, but not far from it (several GB were free).
So, I wonder why the "Transaction log full" error ?
After "transaction log full" striked, a huge rollback was performed. But that rollback also failed.
I ended up in a corrupted DB (although DBCC CHECKDB says it is ok, it is NOT) that is unrecoverable, except restoring it from the backup.
So, I did restore it to the point in time just before that ALTER TABLE started. Losing all the data changes and inputs after that point.
Now users can use it and looks fine, but I'm afraid that it can happen again if I don't find a reason why this severe corruption occurred.
So, please help. I have to find out. Because, I can't believe it is normal that one gets a corrupted db if transaction log fills the disk (and I don't believe that disk was totally full).
The point is - a simple ALTER TABLE ADD COLUMN on 5GB table generates 30+GB of transaction log, and unrecoverable corruption of the database.
So, I wont run that command again until I find the cause of corruption.
Is it really common in SQL server if you run out of disk space that you get corrupted DB ?
Version is SQL 2008R2 RTM on a 2-node cluster.
_____________________________________________________
Microsoft Certified Master: SQL Server 2008
XDetails Addin - for SQL Developers
blog.sqlxdetails.com - Transaction log myths
October 15, 2012 at 12:08 am

This was removed by the editor as SPAM

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 1

http://connect.microsoft.com/SQLServer/feedback/details/684806/rollback-causes-db-to-go-suspect-only-repairable-with-repair-allow-data-loss

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Vedran Kesegic SSCrazy Eights Points: 9302 More actions · Answer 2

Thanks for the link. But I'm not sure it is the same case.

My database was not in a suspect state, it was online.

And DBCC CHECKDB did not find any errors.

Error occured in a transaction log record, and after rollback failed, db started to spit "Could not find database name '5463463456346'" errors when calling tsql that normally works.

_____________________________________________________
Microsoft Certified Master: SQL Server 2008
XDetails Addin - for SQL Developers
blog.sqlxdetails.com - Transaction log myths

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 3

Call CSS. Sounds like you hit a rather strange bug.

Is it reproducible?

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Vedran Kesegic SSCrazy Eights Points: 9302 More actions · Answer 4

After investigation, I found the drive was full indeed at the time of error, but in the morning when I looked at it it was not full, so that explains "Transaction log full" error.

But why does "Transaction log full" results with corruption of DB data?

Is that expected behavior ? (shouldn't be)

_____________________________________________________
Microsoft Certified Master: SQL Server 2008
XDetails Addin - for SQL Developers
blog.sqlxdetails.com - Transaction log myths

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 5

It shouldn't, I've never heard of it happening before. If it has, it's a bug and needs to be reported to CSS, preferably with a reproduction.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

NehwalSaina12 SSC Rookie Points: 39 More actions · Answer 6

The MDF files of SQL database may get corrupt due to some problem in the database. If such a situation arises, the user can use the software which recovers the lost MDF file. But before choosing the software, make sure that it is efficient enough to recover the unique keys, primary keys, indexes, stored procedures, views etc from your corrupt MDF file.

BLOB EATER SSChampion Points: 14609 More actions · Answer 7

NehwalSaina12 (10/8/2012)
The MDF files of SQL database may get corrupt due to some problem in the database. If such a situation arises, the user can use the software which recovers the lost MDF file. But before choosing the software, make sure that it is efficient enough to recover the unique keys, primary keys, indexes, stored procedures, views etc from your corrupt MDF file.

MDF repair software?! Are they actually any good?

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 8

a.SQL.com (10/8/2012)
NehwalSaina12 (10/8/2012)
The MDF files of SQL database may get corrupt due to some problem in the database. If such a situation arises, the user can use the software which recovers the lost MDF file. But before choosing the software, make sure that it is efficient enough to recover the unique keys, primary keys, indexes, stored procedures, views etc from your corrupt MDF file.
MDF repair software?! Are they actually any good?

Most aren't. Buyer beware in this area.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass