Msg 8909 & Msg 8998: Page allocation errors

  • I Recently ran dbcc checkdb on one of our SQLServer boxes (SS2005 Ent SP3, x64) and a DB generated this:

    Msg 8909, Level 16, State 1, Line 1

    Table error: Object ID 0, index ID -1, partition ID 0, alloc unit ID 0 (type Unknown), page ID (1:8088) contains an incorrect page ID in its page header. The PageId in the page header = (0:0).

    Msg 8998, Level 16, State 2, Line 1

    Page errors on the GAM, SGAM, or PFS pages prevent allocation integrity checks in database ID 77 pages from (1:8088) to (1:16175). See other errors for cause.

    Msg 8909, Level 16, State 1, Line 1

    Table error: Object ID 0, index ID -1, partition ID 0, alloc unit ID 0 (type Unknown), page ID (1:16176) contains an incorrect page ID in its page header. The PageId in the page header = (0:0).

    Msg 8998, Level 16, State 2, Line 1

    Page errors on the GAM, SGAM, or PFS pages prevent allocation integrity checks in database ID 77 pages from (1:16176) to (1:24263). See other errors for cause.

    Msg 8909, Level 16, State 1, Line 1

    Table error: Object ID 0, index ID -1, partition ID 0, alloc unit ID 0 (type Unknown), page ID (1:24264) contains an incorrect page ID in its page header. The PageId in the page header = (0:0).

    Msg 8998, Level 16, State 2, Line 1

    Page errors on the GAM, SGAM, or PFS pages prevent allocation integrity checks in database ID 77 pages from (1:24264) to (1:32351). See other errors for cause.

    CHECKDB found 6 allocation errors and 0 consistency errors not associated with any single object.

    CHECKDB found 6 allocation errors and 0 consistency errors in database 'DbSSSS'

    So far db is accessible by users and 'Select * from ...' works against all 91 tables. No repair attempted yet. I restored from prior backups as a different dbname on different SAN but same page errors. I appreciate your guidance to resolve this.

    Thanks in advance..

    Othman

  • You have what looks like errors in the allocation pages. Because of those errors, checkDB can't do a full check of the DB. There is no repair for this.

    Either restore a clean backup of the database along with all log backups (if your backup strategy is good, you'll be able to do that without data loss), or, if you don't have a clean backup then you're going to need to script all objects, export all data and recreate the DB.

    You may be lucky and the damaged allocation pages refer to empty sections of the DB, however that doesn't mean these are benign, they are a fairly serious form of corruption.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • You've got corruption in two PFS pages - this isn't repairable, even with undocumented DBCC options. You need to restore, and also figure out why this happened. Select * can work fine sometimes depending on table structure when this happens.

    Paul Randal
    CEO, SQLskills.com: Check out SQLskills online training!
    Blog:www.SQLskills.com/blogs/paul Twitter: @PaulRandal
    SQL MVP, Microsoft RD, Contributing Editor of TechNet Magazine
    Author of DBCC CHECKDB/repair (and other Storage Engine) code of SQL Server 2005

  • I'm curious. How did you determine that they were PFS pages, not GAM/SGAM? Something about the page ID or is it the size of the range of pages that can't be checked?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Have you checked for hardware errors on the server?

    Error 8998

    Joie Andrew
    "Since 1982"

  • Interesting - that Books Online page is wrong (it's based on a document I wrote while inside MS). A PFS page maps 8088 pages, not 7990. I'll get it fixed.

    Paul Randal
    CEO, SQLskills.com: Check out SQLskills online training!
    Blog:www.SQLskills.com/blogs/paul Twitter: @PaulRandal
    SQL MVP, Microsoft RD, Contributing Editor of TechNet Magazine
    Author of DBCC CHECKDB/repair (and other Storage Engine) code of SQL Server 2005

  • Gail - you mean you don't have all the allocation bitmap page IDs and intervals burned into your brain? Lucky you 🙂

    Paul Randal
    CEO, SQLskills.com: Check out SQLskills online training!
    Blog:www.SQLskills.com/blogs/paul Twitter: @PaulRandal
    SQL MVP, Microsoft RD, Contributing Editor of TechNet Magazine
    Author of DBCC CHECKDB/repair (and other Storage Engine) code of SQL Server 2005

  • Paul Randal (2/3/2010)


    Gail - you mean you don't have all the allocation bitmap page IDs and intervals burned into your brain? Lucky you 🙂

    Not yet, I'm working on it though.

    Can you tell a GAM from SGAM from PFS on the basis of page ID alone? For that matter, with an error like this, can you tell a GAM from an SGAM and, if so how?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • I should do a blog post at some point, or you should do the MCM class 🙂

    Paul Randal
    CEO, SQLskills.com: Check out SQLskills online training!
    Blog:www.SQLskills.com/blogs/paul Twitter: @PaulRandal
    SQL MVP, Microsoft RD, Contributing Editor of TechNet Magazine
    Author of DBCC CHECKDB/repair (and other Storage Engine) code of SQL Server 2005

  • Paul Randal (2/4/2010)


    I should do a blog post at some point

    Please?

    or you should do the MCM class 🙂

    Would love to. As soon as I persuade someone to pay for it for me or I win the lottery.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Gail / Paul ,

    appreciate your followup.

    As suggested:hehe:, I exported data to a new db but noticed that exporting VIEWs causes correponding TABLEs to be created so it seems they have to be scripted then explicitly created.

    Few questions:

    1)As DBCC CHECKDB wasn't run before, this page corruption seems to have been present since weeks ago:w00t:. Does this entail necessity of running CHECKDB on daily basis during afterhours to be aware of similar corruption at early stage?

    2)If required diagnostics arranged to be run against the affected SAN drive, can it also 'fix' or will it only expose damaged area?

    Thanks and Regards,

    Othman

  • abo_moaaz (2/4/2010)


    1)As DBCC CHECKDB wasn't run before, this page corruption seems to have been present since weeks ago:w00t:. Does this entail necessity of running CHECKDB on daily basis during afterhours to be aware of similar corruption at early stage?

    Absolutely! Maybe not daily, but often enough that, should corruption be detected, restore from a clean backup is always an option. Hence your backup strategy will determine the minimum frequency for CheckDB.

    For example, if you only retain 2 days of backups and hence cannot restore to further back than 2 days, checkDB must run at least every 2 days. Personally I'd recommend running it before you run your full backups. No point in backing up a corrupt DB.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Hi ,

    For further analysis :cool:(note that actual db with 8998 error was exported earlier with no loss of data and the old one was renamed ), with that said,

    1)If I run backup with checksum on the corrupted db, it still finishes successfully does this confirm that the corrupted area has no data?

    2) the storgae group informed me that as long as the SAN volume is accessible, from their side it's 'healthy'. Should we as a precaution, replace the volume by a new one ?

    Thanks and Regards

    Othman

  • 1) No. It just means that the corrupt pages don't have page checksums on them for some reason. Was the database upgraded from 2000?

    2) Your storage group are fools. Is a SAN volume with poor IO throughput 'healthy' just becase it's accessible? No. Is a SAN volume that has drives causing corruption 'healthy'? No. You need to educate the storage group of what healthy means - 'does not corrupt the data' is the number 1 requirement and it's not being met here. Just because a drive can be accessed in no way guarantees that there aren't integrity problems with it.

    Thanks

    Paul Randal
    CEO, SQLskills.com: Check out SQLskills online training!
    Blog:www.SQLskills.com/blogs/paul Twitter: @PaulRandal
    SQL MVP, Microsoft RD, Contributing Editor of TechNet Magazine
    Author of DBCC CHECKDB/repair (and other Storage Engine) code of SQL Server 2005

  • Hi Paul,

    just to let you know that i really enjoyed your teched video:-). As for 1) this was a pure SS2005 db (actually, one of the SharePorint DBs). For the storage group part, what i think they meant that their available tools may not tell them much other than accessibility or performance throughput. If it helps, i can mention the brand name of the storage.

    Another insight, it was determined early this week that the installed multipath s/w for the SAN was not compatible with the OS (WS2008 ENT R2 to which we upgraded last October), so it was yesterday upgraded to the compatible level, but the checkdb would still show 8998 on that db.

    Thanks and Regards

    Othman

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply