Help with DB corruption on SQL 2012

Question

Post reply

Help with DB corruption on SQL 2012

ganeshharidas

SSC-Addicted

Points: 410
More actions
February 21, 2016 at 7:03 pm

#324438

1)
My DB is corrupt with this error:
Msg 8947, Level 16, State 1, Line 1
Table error: Multiple IAM pages for object ID 318624178, index ID 1, partition ID 72057594663600128, alloc unit ID 72057594787332096 (type In-row data) contain allocations for the same interval. IAM pages (1:2432) and (1:17430).
DBCC results for 'APP_ScReplicationPair'.
There are 544 rows in 6 pages for object "APP_ScReplicationPair".
CHECKDB found 1 allocation errors and 0 consistency errors in table 'APP_ScReplicationPair' (object ID 318624178).
...
CHECKDB found 1 allocation errors and 0 consistency errors in database 'sql.CommServ.2'.
repair_allow_data_loss is the minimum repair level for the errors found by DBCC CHECKDB (sql.CommServ.2).
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
Full CHECKDB output here: http://pastebin.com/GqQi0nYS
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
2)
CHECKDB with repair_allow_data_loss failed with this error:
DBCC results for 'APP_ScReplicationPair'.
Repair: Page (0:0) next and (1:2432) previous pointers have been set to match each other in object ID 318624178, index ID 1, partition ID 72057594663600128, alloc unit ID 72057594787332096 (type In-row data).
Repair: IAM chain for object ID 318624178, index ID 1, partition ID 72057594663600128, alloc unit ID 72057594787332096 (type In-row data), has been truncated before page (1:2432) and will be rebuilt.
Msg 8947, Level 16, State 1, Line 1
Table error: Multiple IAM pages for object ID 318624178, index ID 1, partition ID 72057594663600128, alloc unit ID 72057594787332096 (type In-row data) contain allocations for the same interval. IAM pages (1:2432) and (1:17430).
Could not repair this error.
Msg 8969, Level 16, State 2, Line 1
Table error: IAM chain linkage error: Object ID 318624178, index ID 1, partition ID 72057594663600128, alloc unit ID 72057594787332096 (type In-row data). The next page for IAM page (1:17430) is (1:2432), but the previous link for page (1:2432) is (0:0).
There are 544 rows in 6 pages for object "APP_ScReplicationPair".
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
3)
I tried a simple rebuild of the corrupt clustered index. It suceeeded, CHECKDB is not reporting corruption anymore.
Is my DB really fixed? Is this a legitimate way to fix DB corruption?
(Row count per sys.partitions does match before and after the index rebuild. Since there is no catalog corruption, I am hoping this information is correct. Is the data intact - I am not sure because I don't have good DB to compare the repaired table with).
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
4)
Corrupt DB:
PageFID PagePID IAMFID IAMPID ObjectID IndexID PartitionNumber PartitionID iam_chain_type PageType IndexLevel NextPageFID NextPagePID PrevPageFID PrevPagePID
------- ----------- ------ ----------- ----------- ----------- --------------- -------------------- -------------------- -------- ---------- ----------- ----------- ----------- -----------
1 17430 NULL NULL 318624178 1 1 72057594663600128 In-row data 10 NULL 1 2432 0 0
1 17431 1 17430 318624178 1 1 72057594663600128 In-row data 2 1 0 0 0 0
1 22062 1 17430 318624178 1 1 72057594663600128 In-row data 1 0 0 0 1 504833
1 22076 1 17430 318624178 1 1 72057594663600128 In-row data 1 0 1 504833 1 22078
1 22078 1 17430 318624178 1 1 72057594663600128 In-row data 1 0 1 22076 1 504832
1 22080 1 17430 318624178 1 1 72057594663600128 In-row data 1 0 1 504832 0 0
1 2432 NULL NULL 318624178 1 1 72057594663600128 In-row data 10 NULL 0 0 1 17430
1 504832 1 2432 318624178 1 1 72057594663600128 In-row data 1 0 1 22078 1 22080
1 504833 1 2432 318624178 1 1 72057594663600128 In-row data 1 0 1 22062 1 22076
If I DBCC PAGE the 2 IAM pages above, I see the actual corruption, "start_pg" in both IAM pages points to "(1:0)".
DBCC IND on the fixed DB ("fixed" thru rebuild index) has only one IAM page. Looks good so far.
PageFID PagePID IAMFID IAMPID ObjectID IndexID PartitionNumber PartitionID iam_chain_type PageType IndexLevel NextPageFID NextPagePID PrevPageFID PrevPagePID
------- ----------- ------ ----------- ----------- ----------- --------------- -------------------- -------------------- -------- ---------- ----------- ----------- ----------- -----------
1 1908 NULL NULL 318624178 1 1 72057594668056576 In-row data 10 NULL 0 0 0 0
1 1910 1 1908 318624178 1 1 72057594668056576 In-row data 2 1 0 0 0 0
1 11072 1 1908 318624178 1 1 72057594668056576 In-row data 1 0 1 11073 0 0
1 11073 1 1908 318624178 1 1 72057594668056576 In-row data 1 0 1 11074 1 11072
1 11074 1 1908 318624178 1 1 72057594668056576 In-row data 1 0 0 0 1 11073
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
5)
Now here is something weird:
From the DBCC IND outputs above you can see the objectid with the problem is 318624178. But DBCC PAGE on the IAM page of the fixed DB has a different ObjectId:
PAGE: (1:1908)
....
Metadata: ObjectId = 173243672 m_prevPage = (0:0) m_nextPage = (0:0)
....
Full DBCC PAGE output here: http://pastebin.com/KmBy2nf4
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
6)
This ObjectId 173243672 is obviously a different table. Here is some info on that:
select object_name(173243672)
>> RMDataPathView
PageFID PagePID IAMFID IAMPID ObjectID IndexID PartitionNumber PartitionID iam_chain_type PageType IndexLevel NextPageFID NextPagePID PrevPageFID PrevPagePID
------- ----------- ------ ----------- ----------- ----------- --------------- -------------------- -------------------- -------- ---------- ----------- ----------- ----------- -----------
1 40428 NULL NULL 173243672 0 1 72057594097893376 In-row data 10 NULL 0 0 0 0
1 40426 1 40428 173243672 0 1 72057594097893376 In-row data 1 0 0 0 0 0
This guy's IAM expectedly has his own objectid:
PAGE: (1:40428)
....
Metadata: ObjectId = 173243672 m_prevPage = (0:0) m_nextPage = (0:0)
...
Full DBCC PAGE output here: http://pastebin.com/QRweb0yg
So, DBCC IND associates two IAM pages with two different objects, yet DBCC PAGE of those IAM pages associates them with the same ObjectId? How can that be?

Viewing 6 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply

Gail Shaw SSC Guru Points: 1004485 More actions · Answer 1

Rebuilding a clustered index may fix it, if the error is in the allocation pages as it appeared to be.

If you run CheckDB WITH NO_INFOMSGS, does it return any errors?

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

ganeshharidas SSC-Addicted Points: 410 More actions · Answer 2

CheckDB WITH NO_INFOMSGS on the corrupt DB says this:

Msg 8947, Level 16, State 1, Line 1

Table error: Multiple IAM pages for object ID 318624178, index ID 1, partition ID 72057594663600128, alloc unit ID 72057594787332096 (type In-row data) contain allocations for the same interval. IAM pages (1:2432) and (1:17430).

CHECKDB found 1 allocation errors and 0 consistency errors in table 'APP_ScReplicationPair' (object ID 318624178).

CHECKDB found 1 allocation errors and 0 consistency errors in database 'sql.CommServ.2'.

repair_allow_data_loss is the minimum repair level for the errors found by DBCC CHECKDB (sql.CommServ.2).

No errors on the repaired DB.

ganeshharidas SSC-Addicted Points: 410 More actions · Answer 3

Any insight is appreciated.

Also, how does a clustered index rebuild work? Is it immune to IAM chain corruptions? If yes, I still don't get why CHECKDB would return repair_allow_data_loss and not repair_rebuild.

Gail Shaw SSC Guru Points: 1004485 More actions · Answer 4

So the rebuild didn't fix it then.

There are two ways SQL can read an index, first using the IAM, second using the index root page and the next page pointers. The first will trip over IAM corruption, the second won't.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

ganeshharidas SSC-Addicted Points: 410 More actions · Answer 5

"So the rebuild didn't fix it then."

Are you saying so because DBCC PAGE of the rebuilt index points to a different ObjectId? Note that CHECKDB after index rebuild doesn't report any errors.

I have two questions:

1) Can we derive any conclusion at all from the fact that "DBCC PAGE of the rebuilt index points to a different ObjectId". Is a way to prove/disprove that the DB is still corrupt from this?

2) If rebuilding the index really got rid of the corruption, shouldn't CHECKDB say repair_rebuild rather than repair_allow_data_loss as the minimum repair level.

2a) As I said before, even repair_allow_data_loss said it "Could not repair this error." So, it tried to rebuild index the IAM chain way and not the data page way, even though the data page way could cure the corruption?