Need help--very urgent..

  • Hi All,

    Here's the issue: I am not able to back up one of our production databases. This is the error it gives:

    BackupIoRequest::WaitForIoCompletion: read failure on backup device 'E:\Database File\abc.mdf'. Operating system error 23(Data error (cyclic redundancy check).).

    What can I do? This is an urgent production issue. Please let us know your views ASAP.

  • RPSql (5/11/2009)


    read failure on backup device 'E:\Database File\abc.mdf'. Operating system error 23(Data error (cyclic redundancy check).).

    Looks like the .MDF file is corrupted. Run DBCC CHECKDB.

  • I would agree that this points to a physical disk problem on your server. Your only recourse may be to recover from your previous backup at this point. Check your SQL Server error log for 824 and 825 errors. Run drive diagnostics and then CHECKDB to find the extent of the damage.

    Jonathan Kehayias | Principal Consultant | MCM: SQL Server 2008
    My Blog | Twitter | MVP Profile
    Training | Consulting | Become a SQLskills Insider
    Troubleshooting SQL Server: A Guide for Accidental DBAs
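
A minimal sketch of how the 824/825 check suggested above might look in T-SQL. It assumes you have access to msdb and that `sp_readerrorlog` is available on your instance (it is undocumented, so treat its parameters and the search strings used here as assumptions):

```sql
-- Pages SQL Server has recorded as suspect after I/O errors.
SELECT DB_NAME(database_id) AS database_name,
       file_id,
       page_id,
       event_type,        -- 1 = 823 or other 824 error, 2 = bad checksum, 3 = torn page
       error_count,
       last_update_date
FROM msdb.dbo.suspect_pages
ORDER BY last_update_date DESC;

-- Search the current error log (first argument 0) of the SQL Server
-- log type (second argument 1) for the text given as the third argument.
EXEC sp_readerrorlog 0, 1, 'Error: 824';
-- 825 is the read-retry warning; this searches for its message text.
EXEC sp_readerrorlog 0, 1, 'succeeded after failing';
```

An empty result from `suspect_pages` does not prove the disk is healthy; it only means SQL Server has not recorded a failed page read there.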

  • Please run the following and post all the results here.

    DBCC CHECKDB (<Database Name>) WITH NO_INFOMSGS, ALL_ERRORMSGS

    Take a look at this article. http://www.sqlservercentral.com/articles/65804/

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • The database size is 320 GB. I started running DBCC CHECKDB on that database yesterday, and it has now been running for 20 hours. Any idea why it is taking so long and has not completed yet?

    In Activity Monitor it is showing the wait type FCB_REPLICA_WRITE. What can I do now?

    Please answer, as this is a production issue and our application is currently down.

    Thanks in advance.

  • Unless you are normally running CHECKDB (a recommended practice to catch these problems early on and reduce downtime/risk of data loss) and can look at the historical runs for how long it takes to run, there isn't a whole lot that you can do except wait it out. If you stop it, you won't get the information that you need to help resolve the problems.

    Jonathan Kehayias | Principal Consultant | MCM: SQL Server 2008

  • Thanks for your reply. I was a little bit worried, as this database is not very large, yet it has already been a day since I started this process. The thread in Activity Monitor is suspended and the wait type is 'FCB_REPLICA_WRITE'.

    Is this normal?

  • Per Books Online, that wait type signals the following:

    Occurs when the pushing or pulling of a page to a snapshot (or a temporary snapshot created by DBCC) sparse file is synchronized.

    http://msdn.microsoft.com/en-us/library/ms179984.aspx

    Based on that I'd say yes it is normal.

    Jonathan Kehayias | Principal Consultant | MCM: SQL Server 2008
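
While waiting, the running check can be watched from a second session. This is only a sketch: it assumes SQL Server 2005 or later, and the progress and finish-time figures it returns are rough estimates rather than guarantees.

```sql
-- Show any running DBCC command with its current wait and reported progress.
SELECT r.session_id,
       r.command,
       r.status,
       r.wait_type,
       r.percent_complete,
       r.start_time,
       -- estimated_completion_time is in milliseconds; convert to seconds
       -- before adding so the value fits DATEADD's int argument.
       DATEADD(second, r.estimated_completion_time / 1000, GETDATE()) AS estimated_finish
FROM sys.dm_exec_requests AS r
WHERE r.command LIKE N'DBCC%';
```

If `percent_complete` keeps moving between runs of this query, the check is still making progress, even when the session shows as suspended on FCB_REPLICA_WRITE at that moment.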

  • RPSql (5/12/2009)


    It has now been running for 20 hours. Any idea why it is taking so long and has not completed yet?

    Maybe because there is corruption.

    The CheckDB algorithms are written in such a way that they can tell quickly whether there is corruption or not, but if there is, then SQL has to go back and do extra detailed searches. It's called a 'deep-dive' and it can make the CheckDB time go up massively.

    Wait until it's finished. To tell what's wrong we need the results. If you stop it now you're just going to have to run it to completion later.

    Gail Shaw
  • The file might be a corrupted one

    +++BLADE+++

  • Just a thought too, but while the DBCC continues you might want to start looking at backups and ensuring that you have one available and ready to go if the corruption is such that it can't be fixed. So, find the most recent good backup and get it available (if on tape, get it off tape). This can save you some time at the end of this process and allow you to get back online faster if you do have to perform a restore.

    Hopefully you won't have to use it though.... 🙂

    David

    @SQLTentmaker

    “He is no fool who gives what he cannot keep to gain that which he cannot lose” - Jim Elliot

  • Hi Gilamonster,

    Thanks for your help. I will wait until this process completes, and so will our application team. I will post the output here; please help if you can.

    One more thing: the last backup I have of this database is four days old. We didn't get notified because we are using IBM Tivoli to take SQL backups directly to tape. Our storage people only notified us yesterday, which was 3 days later!

    Thanks again..

  • David Benoit (5/12/2009)


    Just a thought too, but while the DBCC continues you might want to start looking at backups and ensuring that you have one available and ready to go if the corruption is such that it can't be fixed. So, find the most recent good backup and get it available (if on tape, get it off tape). This can save you some time at the end of this process and allow you to get back online faster if you do have to perform a restore.

    Hopefully you won't have to use it though.... 🙂

    You might look at replacement hardware as well. CRC failures generally indicate physical disk failure and require replacing the bad disk(s) to rectify the problem.

    Jonathan Kehayias | Principal Consultant | MCM: SQL Server 2008

  • Yeah, I have a report that checks the date of each database's last backup to avoid things like that. Hopefully you won't have to use the backup. Regardless, have them make sure the tape is available.

    David
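
A report like the one described could be sketched as follows. This is not the poster's actual report: it assumes the backup tool records its backups in msdb's backup history, and the one-day threshold is an arbitrary choice for illustration.

```sql
-- List databases with no full backup recorded in the last day.
SELECT d.name AS database_name,
       MAX(b.backup_finish_date) AS last_full_backup
FROM sys.databases AS d
LEFT JOIN msdb.dbo.backupset AS b
       ON b.database_name = d.name
      AND b.type = 'D'             -- 'D' = full database backup
WHERE d.name <> N'tempdb'          -- tempdb cannot be backed up
GROUP BY d.name
HAVING MAX(b.backup_finish_date) IS NULL                         -- never backed up
    OR MAX(b.backup_finish_date) < DATEADD(day, -1, GETDATE())   -- or older than a day
ORDER BY last_full_backup;
```

Scheduled as a daily job that sends its results by email, a query along these lines would flag a missed backup the next day rather than several days later.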

  • David Benoit (5/12/2009)


    Just a thought too, but while the DBCC continues you might want to start looking at backups and ensuring that you have one available and ready to go if the corruption is such that it can't be fixed. So, find the most recent good backup and get it available (if on tape, get it off tape). This can save you some time at the end of this process and allow you to get back online faster if you do have to perform a restore.

    I'd also start checking system event logs, RAID controller/SAN logs, etc. Corruption's usually an IO problem.

    Gail Shaw
