DBCC Checkdb consistency errors

  • Hello All,

    After restoring the SQL Server 2000 user database backup into one of the SQL Server 2000 instances I get database consistency errors. I restored this database a couple of times and every time I get the errors; however they are not consistent - different types of errors, different tables and indexes get affected. The first time this happened I rebuilt the indexes and it resolved the errors. This time rebuilding the indexes didn't help. I tried all dbcc checkdb options and finally used the last one -repair_allow_data_loss.

    I restored the same database backup into a different server and ran dbcc checkdb over it - no errors occurred. Something causes corruption in the database on one particular server.

    Here are the errors:

    DBCC results for 'tTREE_COMPONENTS'.

    Server: Msg 8928, Level 16, State 1, Line 1

    Object ID 1467308437, index ID 0: Page (1:251481) could not be processed. See other errors for details.

    Server: Msg 8944, Level 16, State 1, Line 1

    Table error: Object ID 1467308437, index ID 0, page (1:251481), row 44. Test (ColumnOffsets <= (nextRec - pRec)) failed. Values are 87 and 57.

    There are 663 rows in 16 pages for object 'tTREE_COMPONENTS'.

    DBCC results for 'tACCUMULATORS'.

    There are 10234724 rows in 97149 pages for object 'tACCUMULATORS'.

    CHECKDB found 0 allocation errors and 2 consistency errors in table 'tACCUMULATORS' (object ID 1467308437).

    DBCC results for 'WATSTARTDATETP'.

    Server: Msg 8928, Level 16, State 1, Line 1

    Object ID 1905493917, index ID 4: Page (3:224866) could not be processed. See other errors for details.

    Server: Msg 8941, Level 16, State 1, Line 1

    Table error: Object ID 1905493917, index ID 4, page (3:224866). Test (sorted .offset >= PAGEHEADSIZE) failed. Slot 159, offset 0x1 is invalid.

    Server: Msg 8942, Level 16, State 1, Line 1

    Table error: Object ID 1905493917, index ID 4, page (3:224866). Test (sorted.offset >= max) failed. Slot 0, offset 0x9f overlaps with the prior row.

    There are 4 rows in 1 pages for object 'WATSTARTDATETP'.

    DBCC results for 'ACCRUALTRAN'.

    There are 5919148 rows in 150538 pages for object 'ACCRUALTRAN'.

    CHECKDB found 0 allocation errors and 3 consistency errors in table 'ACCRUALTRAN' (object ID 1905493917).

    How would you determine the cause of the database corruption in SQL Server 2000 environment?

    If you could please take a look at the errors and give me your thoughts it would be greatly appreciated.

    Thank you in advance,

    Yulia

  • Sounds like the hardware on the first server is dodgy in some way - if repeated restores on that server result in corruptions but restores on other servers don't, then something in the IO subsystem on that server is broken. I would run SQLIOSim on the server to see if it can flush out any problems.

    Thanks

    Paul Randal
    CEO, SQLskills.com: Check out SQLskills online training!
    Blog:www.SQLskills.com/blogs/paul Twitter: @PaulRandal
    SQL MVP, Microsoft RD, Contributing Editor of TechNet Magazine
    Author of DBCC CHECKDB/repair (and other Storage Engine) code of SQL Server 2005

  • Paul,

    Thank you so much for your advice! I asked our server administrators to run Dell diagnostics but in the meantime I will run SQLIOSim independently.

    I know that you rewrote dbcc checkdb and you are the guru in the database repair. I ran the consistency checker several times over this “problem” SQL Server database and I was surprised by the results. DBCC checkdb with the repair_allow_data_loss option repaired 9 errors; however when I ran dbcc checkdb (dbname) 18 minutes later it detects 714 errors. Does it make sense for you?

    2008-04-25 10:51:08.27 spid53 DBCC CHECKDB (tkcsdb) executed by yfuller found 5 errors and repaired 0 errors.

    2008-04-25 11:33:27.64 spid54 DBCC CHECKDB (tkcsdb, repair_fast) executed by yfuller found 9 errors and repaired 0 errors.

    2008-04-25 12:02:42.90 spid54 DBCC CHECKDB (tkcsdb, repair_allow_data_loss) executed by yfuller found 9 errors and repaired 9 errors.

    2008-04-25 12:20:25.34 spid54 DBCC CHECKDB (tkcsdb) executed by yfuller found 714 errors and repaired 0 errors.

    2008-04-25 12:55:28.81 spid54 DBCC CHECKDB (tkcsdb, repair_fast) executed by yfuller found 714 errors and repaired 714 errors.

    2008-04-25 13:07:32.21 spid54 DBCC CHECKDB (tkcsdb) executed by yfuller found 0 errors and repaired 0 errors.

    Sincerely,

    Yulia

  • Repair is a bad idea. As Paul gave a talk last year at TechEd and walked through what happens. Basically they have no idea how to fix things and they start making guesses. He didn't recommend the allow_data_loss.

    You''d be better off calling PSS and having them help you get the data out and rebuild the table.

    At this point, I'd run some rowcounts and try to determine if you've lost data and perhaps just call PSS. If you have Technet or MSDN, you have some free incidents. If not, buy TechNet ($349) and you get two incidents (normally $249)

  • Well, Steve - it's not that bad. We know how to fix all errors that are detected - there's no guessing involved in repair. What you're thinking of is the online checking algorithm in SS2000 that in a couple of pathalogical cases had to guess as to some log record meanings - *never* in repair.

    Yes, the behavior you're seeing is consistent with my experience sometimes. A corruption may be 'hiding' other corruptions by preventing deeper checking algorithms from running - once the repair is fixed, as subsequent run of CHECKDB may find more errors bevcause other algorithms are now able to run.

    I'd never run repairs without restoring backups first - you will have lost data from your database by doing so (REPAIR_ALLOW_DATA_LOSS is very aptly named).

    Thanks

    Paul Randal
    CEO, SQLskills.com: Check out SQLskills online training!
    Blog:www.SQLskills.com/blogs/paul Twitter: @PaulRandal
    SQL MVP, Microsoft RD, Contributing Editor of TechNet Magazine
    Author of DBCC CHECKDB/repair (and other Storage Engine) code of SQL Server 2005

  • Paul and Steve,

    First of all, thank you so much for your professional advice! I really appreciate it!

    I just want to bring to your attention that I wouldn't do anything like that in the production environment. I was testing the backup-restore for one of our prod databases on the test servers. The restores on one particular test server resulted in corruption but the restores on other servers didn't - I was testing the same database backup file.

    I kept restoring the backup to this "problematic" test server and I ran different dbcc checkdb options over it and documented the errors.

    Per Paul's advice, I am currently running the SQLIOSim on the server with the database corruption.

    Thank you,

    Sincerely,

    Yulia

  • The first, best solution if DBCC CHECKDB reports consistency errors is to restore from a known good backup. However, if you cannot restore from a backup, then CHECKDB provides a feature to repair errors. If system level problems such as the file system or hardware may be causing these problems, it is recommended you correct these first before restoring or running repair.

    When you run DBCC CHECKDB a recommendation is provided to indicate what the minimum repair option that is required to repair all errors. These messages may look something like the following:

    CHECKDB found 0 allocation errors and 15 consistency errors in database 'mydb'. repair_allow_data_loss is the minimum repair level for the errors found by DBCC CHECKDB (mydb)

    SSMS Expert

  • Please note: 6 year old thread (and last post fairly irrelevant anyway)

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply