SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Zeroing out bad block?


Zeroing out bad block?

Author
Message
Mick Opalak
Mick Opalak
SSC-Enthusiastic
SSC-Enthusiastic (107 reputation)SSC-Enthusiastic (107 reputation)SSC-Enthusiastic (107 reputation)SSC-Enthusiastic (107 reputation)SSC-Enthusiastic (107 reputation)SSC-Enthusiastic (107 reputation)SSC-Enthusiastic (107 reputation)SSC-Enthusiastic (107 reputation)

Group: General Forum Members
Points: 107 Visits: 328
I have a 2005, SP4, 64 bit, Enterprise edition server. Our Netapp crashed suddenly and when it was brought back up DBCC CHECKDB found a corruption in a table. After dropping an index and rebuilding it the CRC errors stopped. Problem solved, except that BACKUPS now fail for this database (with the same CRC errors) when the process gets to the bad block in the data file. Netapp is proposing that they zero out that block (overwrite the block with all zeros) to correct this.

Will this work and will it affect the database in any way? My initial thought is that the database should be fine as this block is not currently in use by any objects, but my boss is afraid of further corrupting the data file.

Your thoughts?
GilaMonster
GilaMonster
SSC Guru
SSC Guru (87K reputation)SSC Guru (87K reputation)SSC Guru (87K reputation)SSC Guru (87K reputation)SSC Guru (87K reputation)SSC Guru (87K reputation)SSC Guru (87K reputation)SSC Guru (87K reputation)

Group: General Forum Members
Points: 87243 Visits: 45272
If the block belongs to an allocated page, that will cause further corruption.

What's the error on the backups and what does the following return?

DBCC CHECKDB (<Database Name>Wink WITH NO_INFOMSGS, ALL_ERRORMSGS



Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass


Lowell
Lowell
One Orange Chip
One Orange Chip (28K reputation)One Orange Chip (28K reputation)One Orange Chip (28K reputation)One Orange Chip (28K reputation)One Orange Chip (28K reputation)One Orange Chip (28K reputation)One Orange Chip (28K reputation)One Orange Chip (28K reputation)

Group: General Forum Members
Points: 28173 Visits: 39951
bad idea. bad bad.

first, don't do anything like take the database offline or anything without expert advice; Gail Shaw monitors this forum frequently..
run DBCC CHECKDB('db name') WITH NO_INFOMSGS, ALL_ERRORMSGS and post the results here.

with that, we can tell you a lot more.

you might be able to do anything from page level restore to get the data back, maybe migrate the data out to a new table to fix it, and many other possibilities.there's just too many possibilities without .

letting ANYTHING other than SQL server touch the bits and bytes of the data will guarantee corruption and loss of data.

Lowell

--
help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

Mick Opalak
Mick Opalak
SSC-Enthusiastic
SSC-Enthusiastic (107 reputation)SSC-Enthusiastic (107 reputation)SSC-Enthusiastic (107 reputation)SSC-Enthusiastic (107 reputation)SSC-Enthusiastic (107 reputation)SSC-Enthusiastic (107 reputation)SSC-Enthusiastic (107 reputation)SSC-Enthusiastic (107 reputation)

Group: General Forum Members
Points: 107 Visits: 328
Here's the error from the Backup:
BackupIoRequest::WaitForIoCompletion: read failure on backup device 'T:\Data\T2KAdmin.mdf'. Operating system error 23(Data error (cyclic redundancy check).).

We're going to be running a DBCC CHECKDB but it takes 8 hours to complete (305 GB DB).

We had another idea: since this page is now not used by any object since rebuilding the index, if we shrink the data file that disk block could be reallocated back to the operating system and not be in the data file at all any longer. Then our backups should work, right?
Steve Jones
Steve Jones
SSC Guru
SSC Guru (62K reputation)SSC Guru (62K reputation)SSC Guru (62K reputation)SSC Guru (62K reputation)SSC Guru (62K reputation)SSC Guru (62K reputation)SSC Guru (62K reputation)SSC Guru (62K reputation)

Group: Administrators
Points: 62488 Visits: 19103
It depends on where the block is. If you have this:


data data data data data bad block (free space) (free space)

you could be fine. If you have this:

data data (bad block) (free space) (data) (data) (free space)

you might end up sending another object to the bad block. SQL Server isn't intelligent about the shrink. It reduces the size of the file, moving extents from the end to the beginning.

Ideally, I'd think about copying the mdf file over, but I'd be worried about shutting down the server right now. At the least, I'd get dbcc to finish, and I'd also look at bcp'ing out all data, and I mean all data into files. Drop each table into a separate file and check them for integrity. Worst case, you have to BCP back in, but that's easily scriptable.

I'd also script out all schema. Something like Red Gate's SQL Compare makes this really easy and you could transfer stuff to a shell db if you have the tool.

disclosure: I work for Red Gate.

Follow me on Twitter: @way0utwest
Forum Etiquette: How to post data/code on a forum to get the best help
My Blog: www.voiceofthedba.com
GilaMonster
GilaMonster
SSC Guru
SSC Guru (87K reputation)SSC Guru (87K reputation)SSC Guru (87K reputation)SSC Guru (87K reputation)SSC Guru (87K reputation)SSC Guru (87K reputation)SSC Guru (87K reputation)SSC Guru (87K reputation)

Group: General Forum Members
Points: 87243 Visits: 45272
Maybe, but I wouldn't want to take any chances with this. You don't know for sure that the page is unallocated. If it wasn't allocated, the backup shouldn't be breaking as backups only read allocated extents and if the only corruption was in that index, dropping the index would have deallocated the extent.

p.s. CheckDB Taking 8 hours on a 300GB database? That sounds like a serious IO contention problem in and of itself. I've done checkDB of TB sized databases faster than that.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass


Mick Opalak
Mick Opalak
SSC-Enthusiastic
SSC-Enthusiastic (107 reputation)SSC-Enthusiastic (107 reputation)SSC-Enthusiastic (107 reputation)SSC-Enthusiastic (107 reputation)SSC-Enthusiastic (107 reputation)SSC-Enthusiastic (107 reputation)SSC-Enthusiastic (107 reputation)SSC-Enthusiastic (107 reputation)

Group: General Forum Members
Points: 107 Visits: 328
GilaMonster (5/31/2012)
CheckDB Taking 8 hours on a 300GB database? That sounds like a serious IO contention problem in and of itself. I've done checkDB of TB sized databases faster than that.


I suspect that there are still Netapp issues to be resolved. Since the storage crash and recovery the server has been running at approximately 75% CPU while before the crash it was the normal 20% - 40%.
GilaMonster
GilaMonster
SSC Guru
SSC Guru (87K reputation)SSC Guru (87K reputation)SSC Guru (87K reputation)SSC Guru (87K reputation)SSC Guru (87K reputation)SSC Guru (87K reputation)SSC Guru (87K reputation)SSC Guru (87K reputation)

Group: General Forum Members
Points: 87243 Visits: 45272
May I suggest alternate storage if at all possible...

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass


Ben Thul
Ben Thul
SSC-Enthusiastic
SSC-Enthusiastic (116 reputation)SSC-Enthusiastic (116 reputation)SSC-Enthusiastic (116 reputation)SSC-Enthusiastic (116 reputation)SSC-Enthusiastic (116 reputation)SSC-Enthusiastic (116 reputation)SSC-Enthusiastic (116 reputation)SSC-Enthusiastic (116 reputation)

Group: General Forum Members
Points: 116 Visits: 468
Are you using Snap Manager for SQL? If so, I'd suggest trying to do a native backup to see if it's NetApp that's misinformed regarding the state of that block (it did crash, after all) or whether there's truly something wrong from SQL's perspective. If the native backup works, try restoring it somewhere else (same instance would be fine, but I'd suggest a different LUN, on different drives if possible). If *that* works, I would try for a cutover to that new db (after restoring again and applying logs as necessary).

Good luck. I've dealt with NetApp and SQL Server and will say that it has a bit of a Pinocchio syndrome: when it's good, it's very good. But when it's bad...
Mick Opalak
Mick Opalak
SSC-Enthusiastic
SSC-Enthusiastic (107 reputation)SSC-Enthusiastic (107 reputation)SSC-Enthusiastic (107 reputation)SSC-Enthusiastic (107 reputation)SSC-Enthusiastic (107 reputation)SSC-Enthusiastic (107 reputation)SSC-Enthusiastic (107 reputation)SSC-Enthusiastic (107 reputation)

Group: General Forum Members
Points: 107 Visits: 328
Unfortunately is the native SQL back up that is failing.
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search