Corruption Detection

  • Comments posted to this topic are about the item Corruption Detection

  • Learned something new, and that's not easy on a monday morning!

    Thanks.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • I'd certainly go for Page Checksums, but do MS actually recommend that? Is it inferred that they are better because they also include torn page detection?

  • paul s-306273 (1/24/2011)


    I'd certainly go for Page Checksums, but do MS actually recommend that?

    Yes.

    Is it inferred that they are better because they also include torn page detection?

    They don't include torn page protection, but they can pick up everything that the torn page could and far more. Torn page wouldn't notice that 10 bytes within the page were overwritten (unless those 10 included one set of torn page bits), checksum will.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • again something new for me to learn.

  • GilaMonster (1/24/2011)


    Torn page wouldn't notice that 10 bytes within the page were overwritten (unless those 10 included one set of torn page bits), checksum will.

    This sounds like an interesting event that is on topic.

    What caused this to happen? How did you discover and recover from it?

  • Nice easy question for a nice easy number.

  • SanDroid (1/24/2011)


    What caused this to happen? How did you discover and recover from it?

    Errr...??? I'm saying how it behaves, not describing a situation I encountered. I can manufacture an error like that if you want.

    Recovery would depend on what else is damaged. Could be repair, could be restore (full, filegroup, file or page), could be fixable by dropping an index.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (1/24/2011)


    SanDroid (1/24/2011)


    What caused this to happen? How did you discover and recover from it?

    Errr...??? I'm saying how it behaves, not describing a situation I encountered. I can manufacture an error like that if you want.

    If your could give a real world example of what could cause this to occur, and what the symptom would be.

    Who could ask for a better learning experiance than that?

  • Corruption usually occurs from some hardware issue. A voltage fluctuation, a bad driver, some anomaly that cases the bits written to disk to be incorrect.

  • For once I am glad that I knew an answer to a question without having to look it up to even confirm it.

    I don't remember why, but I wound up reading extensively about the different ways to detecting page corruption... I think when I was considering using Checksum for detecting changes in data during the ETL process of our DW. And I hit this instead, and started reading.

    Some days BOL is an interesting read... I must have read about that in depth for an hour or two.



    --Mark Tassin
    MCITP - SQL Server DBA
    Proud member of the Anti-RBAR alliance.
    For help with Performance click this link[/url]
    For tips on how to post your problems[/url]

  • Steve Jones - SSC Editor (1/24/2011)


    Corruption usually occurs from some hardware issue. A voltage fluctuation, a bad driver, some anomaly that cases the bits written to disk to be incorrect.

    I see. So what Gila was trying to say that sometimes a Disc driver could overwrite bytes in a page with data that shold have been written to a text file. Torn Page detection would not catch this overwrite, but check sum would.

  • BOL says checksum is moderately resource intensive. Will that affect performance?

  • The overhead is listed at 1-2% overhead from what I saw in a presentation by Paul Randal.

    If you can't spare that, you have capacity issues.

  • SanDroid (1/24/2011)


    Steve Jones - SSC Editor (1/24/2011)


    Corruption usually occurs from some hardware issue. A voltage fluctuation, a bad driver, some anomaly that cases the bits written to disk to be incorrect.

    I see. So what Gila was trying to say that sometimes a Disc driver could overwrite bytes in a page with data that shold have been written to a text file. Torn Page detection would not catch this overwrite, but check sum would.

    That's possible, but it's not necessarily that. The disk might receive "1234567" from SQL Server and for some reason, as it writes the bytes on disk, it writes 1234566". The page checksum could potentially catch this. Not fix it, but let you know that your subsystem had an error.

Viewing 15 posts - 1 through 15 (of 25 total)

You must be logged in to reply to this topic. Login to reply