error 823

Question

error 823

george sibbald

SSC Guru

Points: 104210
More actions
February 18, 2011 at 6:26 am

#232274

OK, DBCC shows -
Msg 8966, Level 16, State 2, Line 1
Unable to read and latch page (1:986736) with latch type SH. 38(failed to retrieve text for this error. Reason: 15105) failed.
Msg 8998, Level 16, State 2, Line 1
Page errors on the GAM, SGAM, or PFS pages prevent allocation integrity checks in database ID 181 pages from (1:986736) to (1:994823). See other errors for cause.
CHECKDB found 2 allocation errors and 0 consistency errors not associated with any single object.
CHECKDB found 2 allocation errors and 0 consistency errors in database 'blah'
I am going to presume at this point I am snafu'd and this is irreparable, agreed?
---------------------------------------------------------------------

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 1

Restore from backup. Not repairable.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

george sibbald SSC Guru Points: 104210 More actions · Answer 2

thanks for the confirmation Gail. Have a good day (I'm not!)

---------------------------------------------------------------------

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 3

george sibbald (2/18/2011)
Have a good day (I'm not!)

No backups? :hehe:

Be sure to check the system, raid, SAN logs. Something did this, you don't want it to happen again and worse.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

george sibbald SSC Guru Points: 104210 More actions · Answer 4

its a new database created by an ETL process, database is corrupted as it creates so no chance to back it up!

It obviously can be recreated but new database corrupts every time, so we have an underlying problem.

:sick:

---------------------------------------------------------------------

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 5

george sibbald (2/18/2011)
It obviously can be recreated but new database corrupts every time, so we have an underlying problem.

A fairly serious one I would say. May I suggest putting the DB onto alternate storage if you can?

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

george sibbald SSC Guru Points: 104210 More actions · Answer 6

we just tried that, same error there! Oh dear. something server related or we just found a new bug. I'm calling in MS.

---------------------------------------------------------------------

ashok_a2004 SSC Enthusiast Points: 114 More actions · Answer 7

Hi ,

you try to check this what's this page : is this data page or index page ?

As per my experience It should be index page. You find the questioned index and drop it and then recreate it (don't go to recreate it without dropping it first).

Your problem would be resolved.

Thanks

Ashok

MCP|MCTS|MCITP|MBA IT|

Johan Bijnens SSC Guru Points: 135258 More actions · Answer 8

george sibbald (2/18/2011)
its a new database created by an ETL process, database is corrupted as it creates so no chance to back it up!
It obviously can be recreated but new database corrupts every time, so we have an underlying problem.
:sick:

Can you elaborate on how the ETL tool creates the db ?

If it would restore from a backup, the backup is corrupt.

Johan

Learn to play, play to learn !

Dont drive faster than your guardian angel can fly ...
but keeping both feet on the ground wont get you anywhere :w00t:

- How to post Performance Problems
- How to post data/code to get the best help[/url]

- How to prevent a sore throat after hours of presenting ppt

press F1 for solution, press shift+F1 for urgent solution 😀

Need a bit of Powershell? How about this

Who am I ? Sometimes this is me but most of the time this is me

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 9

ashok_a2004 (3/2/2011)
you try to check this what's this page : is this data page or index page ?

Neither, per the error message it's an allocation page, a page that tracks page or extent allocations

As per my experience It should be index page. You find the questioned index and drop it and then recreate it (don't go to recreate it without dropping it first).

Not going to help here. If the problem was in a nonclustered index, the repair level would be repair_rebuild. It is not. The repair level is missing, meaning it's an irreparable error

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

george sibbald SSC Guru Points: 104210 More actions · Answer 10

thanks for the interest guys and sorry i have not come back but it has a madhouse here (I should be working now rather than replying to this :-)).

The databases are built from scripts (not restore), SSIS loads data and some manipulation occurs after the load. It is at this point the corruption occurs.

Still don't know the cause as the server is now so fritzed cannot even get the info out of it MS want. Currently rebuilding new environments! We think it might be in the cluster layer (this is a GEO_Cluster) so rebuilding standalone.

Might never get an answer to this but if we do I will post back

---------------------------------------------------------------------

george sibbald SSC Guru Points: 104210 More actions · Answer 11

thought I would feed back on the root cause of this.

Investigations of the disk showed no errors on the disks, which is unusual for an 823 error but there you are. Interesting thing was when the service is restarted the corrupted dbs list a 5125 error on recovery -

.mdf' appears to have been truncated by the operating system. Expected size is 7905664 KB but actual size is 7887736 KB.

file looks ok in sys.master_files and via explorer, but a clue there that the file is shortened in some way.

Turned out to be disk keeper, the tool we use to defragment at the OS level, and runs while SQL is active. Turn it off and no problems. We use this tool throughout our environment without issues but something about this build causes problems (windows 2008, SQL2008R2 Geo-cluster, SAN replication, very high IO rates.)

Its with disk keeper now to see if there is a fix but meanwhile I have had to rebuild their servers as standalone, which was a right royal pain.

---------------------------------------------------------------------

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 12

Another Diskeeper-related corruption problem... http://sqlskills.com/BLOGS/PAUL/post/Diskeeper-10-Intelliwrite-corruption-bug.aspx

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

george sibbald SSC Guru Points: 104210 More actions · Answer 13

thank you Gail, some definite similarities there, especially in the 823, 8966 errors and apparent EOF conditions. Main difference was my databases WERE actually corrupted.

I will pass this on to our server admins who are liaising with disk keeper.

---------------------------------------------------------------------

george sibbald SSC Guru Points: 104210 More actions · Answer 14

root cause -

After speaking with Diskeeper they advised us to install Diskeeper 10 version 14.0.913 which it the latest build, we have installed this and retested and we have not seen any issues since. We now plan to upgrade the rest of the SQL boxes with Diskeeper installed. The un-install and reinstall does not require a reboot.

The above patch is the same one disk keeper recommend for the dbcc checkdb issue Gail referred to.

Moral is to be careful if you intend to upgrade to version 10 of diskkeeper on a SQL box (this problem was introduced with version 10) and ensure you are patched to at least the version above.

---------------------------------------------------------------------