Help, my database is corrupt. Now what?

  • Could you start a new thread for your corruption problem please?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Sorry - I probably asked my question in the wrong way.

    What I really wanted to know was whether you would proceed differently (i.e., more cautiously) knowing the corruption affected a system table as opposed to a user table. Or is it simply that a table is a table is a table?

    I wasn't really looking for a solution to my specific corruption problem. I've already written and tested a fix script (thanks to advice in your article and in Paul Randal's material on SQLskills.com).

    I do see now that a lot of people have posted specific issues in this forum, and I will be careful to avoid that!

    Thanks,

    Doug

  • doug.baker-706021 (12/11/2012)


    What I really wanted to know was whether you would proceed differently (i.e., more cautiously) knowing the corruption affected a system table as opposed to a user table.

    Generally, yes. A system table cannot be repaired, cannot have indexes rebuilt. Typically corruption in a system table means restore from backup with few to no other options available.

    Since the error you have isn't really corruption (just bad metadata), perhaps it's fixable. Not an error I've encountered in a system table before.

    Gail Shaw
  • Thank you, Gail. I appreciate the additional information.

    Fixing this particular system table in this particular case appears to have had no adverse effects in the test environment, so I think your comment about it not really being corruption hits the point here. I think I will now go get my copy of SQL Server 2008 Internals, read the DBCC chapter by Paul Randal that you mentioned in the footnote of your article, and ponder this further.

    --Doug

  • I had a disk array go nuts recently and it corrupted msdb and an application DB. I could repair msdb, and we had full and log backups for the application DB. Whew! We survived.

    BUT! I have a question... I’ll do this in an outline so maybe it will be easier to follow:

    1. Let's say that you have an application DB that gets fully backed up at 1:00am.

    2. It has transaction log backups every 15 minutes.

    3. The DB gets corrupted at, say, 6pm.

    4. The corruption results in the loss of all the records in a user table.

    5. At 6:30pm, the application actually pulls off inserting records into the table.

    6. Your CheckDB does not run until 8:00pm.

    Is there a chance that the transaction log backup will be funky because of inserts (etc) into the corrupted table? To be specific, if there are no records in the table because of the corruption, that could result in inserts that would not have normally happened if there were records in the table. So, when you apply the log backup, it would try to insert duplicate records?

    I am sure that I am ignorant of something (like something in the nature of log backups), but I have to ask!

    Thanks,

    M

  • Yes, potentially.

    You'd have to do the following:

    - extract all inserts performed after the table was emptied

    - restore the database to a time just before the corruption occurred

    - manually merge in the post-corruption inserts, taking care of duplicates etc
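
    The restore step above (to a point just before the corruption) might look roughly like the following T-SQL. This is only a sketch under assumptions: the database name, logical file names, backup paths, and the STOPAT time are hypothetical placeholders loosely based on the 1:00am full backup and 6pm corruption in the scenario, and it restores to a separate copy so the damaged database stays available for the merge.

    ```sql
    -- All names, paths, and times below are hypothetical; adjust to your environment.
    -- Restore the 1:00am full backup as a separate copy, ready to accept log restores.
    RESTORE DATABASE AppDb_PreCorruption
        FROM DISK = N'X:\Backups\AppDb_Full.bak'
        WITH MOVE N'AppDb_Data' TO N'X:\Restore\AppDb_PreCorruption.mdf',
             MOVE N'AppDb_Log'  TO N'X:\Restore\AppDb_PreCorruption.ldf',
             NORECOVERY;

    -- Apply each 15-minute log backup in sequence (repeat this for every one) ...
    RESTORE LOG AppDb_PreCorruption
        FROM DISK = N'X:\Backups\AppDb_Log_1745.trn'
        WITH NORECOVERY;

    -- ... and stop just before the assumed 6:00pm corruption time in the last one.
    RESTORE LOG AppDb_PreCorruption
        FROM DISK = N'X:\Backups\AppDb_Log_1800.trn'
        WITH STOPAT = N'2012-12-28T17:59:00', RECOVERY;
    ```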

    Tedious.

    However, how would the corruption have removed all records in the table but still allow inserts? By 'corruption', do you mean 'someone accidentally deleted all the records'? That's the only way your scenario can occur IMHO.

    Thanks

    Paul Randal
    CEO, SQLskills.com: Check out SQLskills online training!
    Blog:www.SQLskills.com/blogs/paul Twitter: @PaulRandal
    SQL MVP, Microsoft RD, Contributing Editor of TechNet Magazine
    Author of DBCC CHECKDB/repair (and other Storage Engine) code of SQL Server 2005

  • Thanks Paul. That is what I thought.

    The idea that the table was actually corrupted (not by a user, but by, say, a disk array error) and still allowed inserts is the paranoid DBA in me talking! But just because you are paranoid, doesn’t mean... 😉

    I have seen DBs get corrupted and still function just “fine”, but most likely not in this sort of scenario.

    So you think that if a table had issues, SQL would not allow the inserts? If so, that is good. I would love a crash at that point instead of a silent failure.

    What is bothering me is that SQL does not tell you right away that there is DB corruption. I run CheckDB once a day on all the production servers, but I would much rather know in 30 seconds!

    What tool would you use to read the logs and merge the data back together in my hypothetical case?

    Again Thank you! This site has saved our bacon more than once.

    M

  • You wouldn't read the logs. You'd have the corrupt database and the restored pre-corruption database, and then you'd manually merge the data in the two tables.
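
    A manual merge of that kind could be sketched as below. The database names (AppDb for the damaged database, AppDb_PreCorruption for the restored copy), the table dbo.Orders, and its columns are all hypothetical, and the sketch assumes the table has a key column that can be used to spot duplicates.

    ```sql
    -- Hypothetical names: AppDb is the damaged database, AppDb_PreCorruption the restored copy.
    -- First, look at rows that exist in both databases so conflicts can be reviewed by hand.
    SELECT c.OrderID
    FROM AppDb.dbo.Orders AS c
    JOIN AppDb_PreCorruption.dbo.Orders AS r
        ON r.OrderID = c.OrderID;

    -- Then copy across only the post-corruption rows that the restored copy lacks.
    INSERT INTO AppDb_PreCorruption.dbo.Orders (OrderID, CustomerID, OrderDate)
    SELECT c.OrderID, c.CustomerID, c.OrderDate
    FROM AppDb.dbo.Orders AS c
    WHERE NOT EXISTS (
        SELECT 1
        FROM AppDb_PreCorruption.dbo.Orders AS r
        WHERE r.OrderID = c.OrderID
    );
    ```

    If the key is an identity column, the insert would also need SET IDENTITY_INSERT ON for that table, and rows with the same key but different contents still have to be resolved by hand.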

    Paul Randal

  • michael merrill (12/28/2012)


    The idea that the table was actually corrupted (not by a user, but by, say, a disk array error) and still allowed inserts is the paranoid DBA in me talking! But just because you are paranoid, doesn’t mean... 😉

    Depending on how badly and what has been corrupted, the inserts may run fine or they may fail. If they run fine, the inserts will be logged normally (and hence be present in log backups). If they fail, you get blatant error messages.

    What is bothering me is that SQL does not tell you right away that there is DB corruption.

    Sure it does. The instant that any query encounters corruption you get an error in the error log (823 or 824 are the more common). However, if you don't read the corrupt page, SQL has no way of intuiting that a page on disk that it has not read has been damaged by the IO subsystem.

    CheckDB, because it reads every page in the DB, will find all corrupt pages. Normal queries running against the DB might not use the pages that are damaged and hence will never notice that they're corrupt.
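
    As a rough illustration of the two detection paths described above, the following T-SQL looks at pages that queries have already tripped over, sets up alerts for errors 823 and 824, and runs a full check. The alert names and the database name AppDb are hypothetical, and actually being notified would also need an operator and a notification, which are not shown.

    ```sql
    -- Pages that some read has already hit an 823/824-type error on are recorded here.
    SELECT database_id, file_id, page_id, event_type, error_count, last_update_date
    FROM msdb.dbo.suspect_pages;

    -- Hypothetical alert names: raise a SQL Agent alert as soon as 823 or 824 is logged.
    EXEC msdb.dbo.sp_add_alert
        @name = N'Error 823 - I/O error',
        @message_id = 823,
        @severity = 0,
        @include_event_description_in = 1;

    EXEC msdb.dbo.sp_add_alert
        @name = N'Error 824 - logical consistency I/O error',
        @message_id = 824,
        @severity = 0,
        @include_event_description_in = 1;

    -- Damage on pages nobody has read yet only shows up when every page is checked.
    DBCC CHECKDB (N'AppDb') WITH NO_INFOMSGS, ALL_ERRORMSGS;
    ```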

    Gail Shaw
  • Thanks bunches Gail, I inherited a mess and one of your sections is going to help me recover a DB. 😀

    Drive it like you stole it

  • Make sure you check the database ID when getting a corruption message. I sometimes see the corruption message in tempdb (database 2).

    Server: Msg 605, Level 21, State 3, Attempt to fetch logical page (1:39296) in database 2 failed. It belongs to allocation unit 72153958174228480 not to 5188243200689831936.

    It turned out that a trace flag had somehow been removed (or was never added to the service startup parameters), and we needed to add trace flag 4199 back to the server (2008 R2 SP2) with the command dbcc traceon(4199, -1). This usually happens when someone forgets to put it in the service startup line.
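
    A minimal sketch of those checks in T-SQL, assuming the message refers to database 2 as in the example above. It resolves the database ID to a name, then reports and re-enables the trace flag named in this post.

    ```sql
    -- Resolve the database ID from the error message; ID 2 is tempdb.
    SELECT DB_NAME(2) AS database_name;

    -- Report whether trace flag 4199 is currently enabled globally.
    DBCC TRACESTATUS (4199, -1);

    -- Enable it globally for the running instance. This does not survive a restart.
    DBCC TRACEON (4199, -1);
    ```

    To keep the flag across restarts it would have to be in the service startup parameters as -T4199.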

  • Hi,

    I have an "error" message in the SQL error log which, after executing DBCC CHECKTABLE on a table, tells me that it "found 1 errors and repaired 0 errors." The DBCC CHECKTABLE itself ran to completion on the table. There is nothing more in the SQL error logs to indicate any corruption, and nothing in the Windows system/application logs either. DBCC CHECKTABLE was run with the NO_INFOMSGS option. Does this mean there is a possible corruption issue with the table? This problem occurred in the production environment and, as you can guess, I am very keen to get some suggestions.
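
    One way to see the text of the error itself, rather than only the summary line in the error log, is to re-run the check from a query window and read the messages it returns. A minimal sketch, with dbo.MyTable standing in as a hypothetical name for the affected table:

    ```sql
    -- dbo.MyTable is a placeholder for the table that reported the error.
    -- NO_INFOMSGS hides informational output only; error messages are still returned.
    DBCC CHECKTABLE (N'dbo.MyTable') WITH NO_INFOMSGS, ALL_ERRORMSGS;
    ```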
