In the Real World - Lost Data Chains

  • Comments posted to this topic are about the content posted at http://www.sqlservercentral.com/columnists/sjones/intherealworldlostdatachains.asp

  • Hi Steve

    Interesting problem. I ran a forum post on a similar issue I had some time back. My issue was similar, in the fact that some queries were fine, and others would simply hang, debugging was slightly easier as the "hang" ended up in a disconnection. A DBCC CHECKDB showed up the issue, and in your case was perhaps the way to go early on. The fix is also interesting, in my case a reboot was fine, and havent had the problem since, I tried re-indexing with no luck (seemed to shift the problem elsewhere which really scared me).

    BTW, do you run prod with any particular trace flags enabled?

    Cheers

    Ck


    Chris Kempster
    www.chriskempster.com
    Author of "SQL Server Backup, Recovery & Troubleshooting"
    Author of "SQL Server 2k for the Oracle DBA"

  • Steve,

    Quite an interesting story. Sounds like the week from Hell. I admire your persistance in solving this problem.

    I suspect SQL Server was able to rebuild the page pointers by dropping them all, then relinking with the IAMs.

    Thanks for the guidance on how remove a nasty thorn from one's SQL Server paw.

    Larry Ansley

    Atlanta, GA


    Larry Ansley
    Atlanta, GA

  • Great piece of detective work Steve 😀

    We've a similar problem on a prod. server which occurs once every 4-6 weeks and is proving very elusive to nail down, so may check out (re)building clustered indexes first. Interestingly, we've already run dbcc checkdb, but it's reported no problems, though the symptoms sound suspiciously similar to yours. Thank you for a thought provoking article.

    Edited by - jonreade on 06/17/2003 01:58:10 AM


    Jon

  • I didn't have any trace flags, though I probably should have. I did run a checkdb and it didn't show errors.

    So far it hasn't appeared again and I hope it doesn't.

    Steve Jones

    sjones@sqlservercentral.com

    http://www.sqlservercentral.com/columnists/sjones

    http://www.dkranch.net

  • really a strange case. RID n:nnnnn error\ index chain error should have been generated when checkdb\checktable was executed.

    anyway, thanks for the info

  • I would have expected this as well.

    Oh well.

    Steve Jones

    sjones@sqlservercentral.com

    http://www.sqlservercentral.com/columnists/sjones

    http://www.dkranch.net

  • Wow! I would love to work with you for awhile.  I know I could learn a great deal.

  • Steve, here is my story about PK with duplicates--------------

    dbcc checkTABLE(tblABC)  

    Server: Msg 8934, Level 16, State 2, Line 1

    Table error: Object ID 517576882, index ID 1. The high key value on page (1:5667) (level 0) is not less than the low key value in the parent (0:1), slot 0 of the next page (1:11745).

    Server: Msg 8933, Level 16, State 1, Line 1

    Table error: Object ID 517576882, index ID 1. The low key value on page (1:11745) (level 0) is not >= the key value in the parent (1:5209) slot 21.

    DBCC results for 'tblABC'.

    There are 390047 rows in 9378 pages for object 'tblABC'.

    CHECKTABLE found 0 allocation errors and 2 consistency errors in table 'tblABC' (object ID 517576882).

    repair_rebuild

    --======================================

    The real issue is the duplicate on primary key and on unique index-----unbelieveable !!!

    Server: Msg 1505, Level 16, State 1, Line 1

    CREATE UNIQUE INDEX terminated because a duplicate key was found for index ID 1. Most significant primary key is '432446'.

    Server: Msg 8934, Level 16, State 1, Line 1

    To face the reality. I have to drop all the indexes, including the clusterindex for PK and another concatenated unique index for (eventID, eventTime). Only at this point, I could run a sql query (having count(*) > 1) to find the duplicate records ---- 100% duplicate on all fields.

    Finally, I could delete 41 duplicate records by create an identity field to distinguish them. The lesson I learned is there could be duplicates exceptions for anything including PrimaryKey field in SQL server.

  • We've had similar issues with clustered indexes getting "trashed", but still being able to get to the data.  Symptoms included high CPU and slow performance. I dropped the clustered index which, oddly enough, recovered the data just nicely.  When I recreated the clustered index, everything was fine. I have yet to get a really complete technical explanation as to why this happened.  All I know is, every time something like this has happened, it's soon after introducing a new storage device (SCSII drive array or SAN) in which the "Write" cache is enabled!.  I always turn off the write caching or set the buffer % to zero. SQL Server does not do well at all with write caching.  Just FYI.

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply