Dbcc checkdb says 0 errors but SQL error 823 produced

  • The db is working as normal, but in the SQL logs I see these 3 messages

    The operating system returned error incorrect checksum (expected: 0x1b0a0fbe; actual: 0x1b0a0fbe) to SQL Server during a read at offset 0x00000ae7e9c000 in file 'D:\Data\MyData.mdf'. Additional messages in the SQL Server error log and system event log may provide more detail. This is a severe system-level error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.

    Error: 823, Severity: 24, State: 7.

    Operating system error 'incorrect checksum (expected: 0x1b0a0fbe; actual: 0x1b0a0fbe)' resulted from attempt to read the following: sort run page (3:5717838), in file 'D:\Data\MuData_data.mdf', in database with ID 23. Sort is retrying the read.

    I've run dbcc checkdb(MyData) but that said 0 errors found 0 errors repaired.

    This is a sql 2005 instance and the DB page verify is set to checksum

    Any ideas from a SQL viewpoint? whilst I also ask the infrastructure team to check the D: drive

  • It's a sort page, so temporarily allocated for a sort operation, probably an index rebuild, and deallocated afterwards. Hence by the time CheckDB ran, the page had long since been deallocated. Deallocated pages can't be checked with checkDB as they are not part of the consistent database structure.

    That said, something's up with the IO subsystem if that could happen at all, check carefully.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • thanks for the reply.

    Yes the times this has been noted is during an maintenance task that rebuilds indexes and when new application objects are deployed/compiled.

    So my understanding is that we have no immediate danager as dbcc checkdb says all ok, but a 'small' problem on the disk somewhere so does need to be investigate as quickly as possible.

    Can you elaborate on 'check carefully', I mean is there anything from SQL I can do or just ask our hardware team to fully test the D: drive.

    Thanks

  • Not necessarily a small problem on the disk, you can't conclude that. Could be a huge problem that's only shown up once so far.

    No, nothing really from SQL. Check windows logs, check raid logs, san logs, etc.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Hi Gail,

    I have an update from my hardware guys.

    The D: drive is very fragmented and chdkdsk failed, so their solution is to create a new LUN on clean disks format as X: then take SQL offline, copy the entire contents of D: (just MDf/NDF files really) to X: then drop D: rename X: to D: then bring SQL back online.

    I wanted to use detach/attach as we have many Db's for different countries so downtime would be just for that country being moved rather than server down until all are copied, although my way would mean we cannot use D: drive

    Thanks for youe help.

  • Make sure you have backups of all of those databases first.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Hi,

    Yes I am asking/insisting for this (as their is a fussy line as to responsilbilites due to geography and where its a SQL or infrastructure issue) the last reply I had from them was this

    As we copy the data and do not write anything to the LUN in question, I see no need for (extra)backups – the risk of the old LUN to fail ultimately is not bigger as right now in this moment or any other.

    He misses the whole point of a backup.

    I shall insist though. 🙂

  • Ask him if he wants to bet his job on there being no additional problems (because without backups, that's what you would be doing)

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (4/24/2012)


    Ask him if he wants to bet his job on there being no additional problems (because without backups, that's what you would be doing)

    +1000 Nobody has ever gotten fired for having backups and not needing them!

    But then why are they even involved in the backup process. That should be the DBA's duty.


    My blog: SQL Soldier[/url]
    SQL Server Best Practices:
    SQL Server Best Practices
    Twitter: @SQLSoldier
    My book: Pro SQL Server 2008 Mirroring[/url]
    Microsoft Certified Master: SQL Server, Data Platform MVP
    Database Engineer at BlueMountain Capital Management[/url]

  • sotn ,

    I'm curious if moving your data to a new LUN has solved your issue. I'm running into the same issue with checksums on my tempdb and like you my expected and actual checksums are matching in the output. I'm using a SAN so wondering if I just create a new drive partition from my datastore and move all my dbs over. Thanks!

  • Sometimes due to high page level corruption the DBCC CHECKDB command refused to repair the SQL database and shows consistency failure errors. So if you have good backup then restore it otherwise in this condition to resolve it the use of external agent is more admirable.

    SSMS Expert

  • I've run dbcc checkdb(MyData) but that said 0 errors found 0 errors repaired.

    Did you disconnect all users/connections from the database first?

    http://msdn.microsoft.com/en-us/library/ms345598(v=sql.105).aspx"> http://msdn.microsoft.com/en-us/library/ms345598(v=sql.105).aspx

    I imagine there are sections that dbcc could have missed. It could have been a temporary allocation of memory as mentioned earlier and the bad sector being deallocated for the moment as Gail mentioned.

    ----------------------------------------------------

  • Please note: 2 year old thread.

    Edit: and checkDB does not require users to be disconnected, it's an online operation.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • I did find something interesting on Books Online regarding CHECKDB : Just wanted to note here >

    "databases that are created on SQL Server 2005 and later should never contain incorrect counts; however, databases that are upgraded to SQL Server 2005 and later might. "

    Wow, another fine detail to memorize 🙂

    http://msdn.microsoft.com/en-us/library/ms176064(v=sql.105).aspx

    ----------------------------------------------------

  • Hence the standard recommendation to run DBCC UpdateUsage when upgrading a DB. And 2008 still has incorrect counts occasionally.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply