When are people going to learn RAID1 is not a backup strategy !

  • Today I received a call stating

    both disks of our RAID1 crashed !

    We've sent the disks to a file recovery company, they managed to recover all the sqlserver db files.

    They attached them (no messages there), but now things doen't work!

    Euhm ... by the way .... we have no other backups ....

    As usual, you try to help out, runing dbcc checkdb and indeed it returns tons of inconsistencies.

    ( 3 of the 5 databases are'nt checkable at all, others complain about LOB pointers and missing links )

    So, you pass the message, data will be losst, if possible at all to get some data out of them.

    None of these databases are repairable.

    dbcc checkdb( db, repair_allow_dataloss) fails, even after the db has been set to emergency mode.

    So I advised them to contact MS PSS and see if they can help out.

    When are people going to learn RAID is not a backup strategy !

    You always need a backup strategy for your databases, your instances, your applications !

    It is your lifeline !

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • ALZDBA (9/23/2010)


    So I advised them to contact MS PSS and see if they can help out.

    Tell them not to waste their time/money. PSS is unlikely to be able to help. If you're curious, get the messages from CheckDB and post here and I'll tell you how far up the creek they are.

    When are people going to learn RAID is not a backup strategy !

    It's not? But, but, but, but...

    http://www.google.com/search?q=%22Raid+1%22+backup

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Since you asked for it 😉

    Msg 8928, Level 16, State 1, Line 1

    Object ID 53575229, index ID 1, partition ID 72057594039697408, alloc unit ID 72057594043695104 (type LOB data): Page (1:507) could not be processed. See other errors for details.

    Msg 8939, Level 16, State 98, Line 1

    Table error: Object ID 53575229, index ID 1, partition ID 72057594039697408, alloc unit ID 72057594043695104 (type LOB data), page (1:507). Test (IS_OFF (BUF_IOERR, pBUF->bstat)) failed. Values are 12716041 and -4.

    Msg 8965, Level 16, State 1, Line 1

    Table error: Object ID 53575229, index ID 1, partition ID 72057594039697408, alloc unit ID 72057594043695104 (type LOB data). The off-row data node at page (1:507), slot 0, text ID 153944064 is referenced by page (1:481), slot 11, but was not seen in the scan.

    another db

    Msg 8921, Level 16, State 1, Line 1

    Check terminated. A failure was detected while collecting facts. Possibly tempdb out of space or a system table is inconsistent. Check previous errors.

    Msg 824, Level 24, State 2, Line 1

    SQL Server detected a logical consistency-based I/O error: incorrect checksum (expected: 0x8c4444ea; actual: 0x8c6444ca). It occurred during a read of page (1:19170) in database ID 16 at offset 0x000000095c4000 in file 'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data\Anotherdb.mdf:MSSQL_DBCC16'.

    a rerun after enlarging tempdb didn't solve anything.

    (I has exclusive access on this instance)

    I'm always glad to learn new stuff ....

    edited: euhm ... yes ... I know .... databases in c:\program files :crazy::sick:

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • The second one's irreparable. System table damaged (which one I can't tell). It may be possible to script and export to partially recover that one, depending which system table it is.

    Might be able to tell which object it is by checking other pages in that extent (or might not). See if you can run this and grab value for the Metadata: ObjectId

    DBCC TRACEON(3604)

    DBCC PAGE (16,1,19168)

    DBCC PAGE (16,1,19171)

    DBCC TRACEOFF(3604)

    Depends on the extent being dedicated not mixed.

    First one I can't see anything really bad, but that's not the entire output of CheckDB. If you can post the entire output....

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • use Anotherdb

    print db_id()

    go

    11

    dbcc checkdb ( Anotherdb )

    go

    Msg 8921, Level 16, State 1, Line 1

    Check terminated. A failure was detected while collecting facts. Possibly tempdb out of space or a system table is inconsistent. Check previous errors.

    Msg 824, Level 24, State 2, Line 1

    SQL Server detected a logical consistency-based I/O error:

    incorrect checksum (expected: 0x8c4444ea; actual: 0x8c6444ca).

    It occurred during a read of page (1:19170) in database ID 21 at offset 0x000000095c4000 in

    file 'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data\Anotherdb.mdf:MSSQL_DBCC21'.

    Additional messages in the SQL Server error log or system event log may provide more detail.

    This is a severe

    error condition that threatens database integrity and must be corrected immediately.

    Complete a full database consistency check (DBCC CHECKDB).

    This error can be caused by many factors;

    for more information, see SQL Server Books Online.

    DBCC TRACEON(3604)

    -- It occurred during a read of page (1:19170) in database ID 21 at offset 0x000000095c4000 in

    DBCC PAGE (21,1,19168)

    DBCC PAGE (21,1,19171)

    -- using the db_id I printed in the first batch

    DBCC PAGE (11,1,19168)

    DBCC PAGE (11,1,19171)

    DBCC TRACEOFF(3604)

    DBCC execution completed. If DBCC printed error messages, contact your system administrator.

    Msg 2521, Level 16, State 10, Line 3

    Could not find database ID 21. The database ID either does not exist, or the database was dropped before a statement tried to use it. Verify if the database ID exists by querying the sys.databases catalog view.

    Msg 2521, Level 16, State 10, Line 4

    Could not find database ID 21. The database ID either does not exist, or the database was dropped before a statement tried to use it. Verify if the database ID exists by querying the sys.databases catalog view.

    PAGE: (1:19168)

    BUFFER:

    BUF @0x0000000092FD2380

    bpage = 0x000000009248E000 bhash = 0x0000000000000000 bpageno = (1:19168)

    bdbid = 11 breferences = 0 bUse1 = 61851

    bstat = 0xc00009 blog = 0x59ca2159 bnext = 0x0000000000000000

    PAGE HEADER:

    Page @0x000000009248E000

    m_pageId = (1:19168) m_headerVersion = 1 m_type = 1

    m_typeFlagBits = 0x4 m_level = 0 m_flagBits = 0x200

    m_objId (AllocUnitId.idObj) = 60 m_indexId (AllocUnitId.idInd) = 1 Metadata: AllocUnitId = 281474980642816

    Metadata: PartitionId = 281474980642816 Metadata: IndexId = 1

    Metadata: ObjectId = 60 m_prevPage = (1:26) m_nextPage = (1:19169)

    pminlen = 17 m_slotCnt = 2 m_freeCnt = 7566

    m_freeData = 7761 m_reservedCnt = 0 m_lsn = (485:26249:3)

    m_xactReserved = 0 m_xdesId = (0:0) m_ghostRecCnt = 0

    m_tornBits = 376799919

    Allocation Status

    GAM (1:2) = ALLOCATED SGAM (1:3) = NOT ALLOCATED

    PFS (1:16176) = 0x40 ALLOCATED 0_PCT_FULL DIFF (1:6) = CHANGED

    ML (1:7) = NOT MIN_LOGGED

    DBCC execution completed. If DBCC printed error messages, contact your system administrator.

    PAGE: (1:19171)

    BUFFER:

    BUF @0x0000000090FC9A00

    bpage = 0x0000000090268000 bhash = 0x0000000000000000 bpageno = (1:19171)

    bdbid = 11 breferences = 0 bUse1 = 61851

    bstat = 0xc00009 blog = 0x52152159 bnext = 0x0000000000000000

    PAGE HEADER:

    Page @0x0000000090268000

    m_pageId = (1:19171) m_headerVersion = 1 m_type = 1

    m_typeFlagBits = 0x4 m_level = 0 m_flagBits = 0x200

    m_objId (AllocUnitId.idObj) = 60 m_indexId (AllocUnitId.idInd) = 1 Metadata: AllocUnitId = 281474980642816

    Metadata: PartitionId = 281474980642816 Metadata: IndexId = 1

    Metadata: ObjectId = 60 m_prevPage = (1:11857) m_nextPage = (1:63456)

    pminlen = 17 m_slotCnt = 1 m_freeCnt = 7414

    m_freeData = 776 m_reservedCnt = 0 m_lsn = (492:27409:3)

    m_xactReserved = 0 m_xdesId = (0:0) m_ghostRecCnt = 0

    m_tornBits = 1719641107

    Allocation Status

    GAM (1:2) = ALLOCATED SGAM (1:3) = NOT ALLOCATED

    PFS (1:16176) = 0x40 ALLOCATED 0_PCT_FULL DIFF (1:6) = CHANGED

    ML (1:7) = NOT MIN_LOGGED

    DBCC execution completed. If DBCC printed error messages, contact your system administrator.

    DBCC execution completed. If DBCC printed error messages, contact your system administrator.

    object_name(60) ="sysobjvalues"

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • Am curious...

    DBCC Page(11,1,19170)

    Will probably give error

    I suspect that one's in the 'export data, script objects and hope for the best' category. Not repairable.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • that results into:

    PAGE: (1:19170)

    BUFFER:

    BUF @0x000000008FFF0400

    bpage = 0x000000008FC10000 bhash = 0x0000000000000000 bpageno = (1:19170)

    bdbid = 11 breferences = 0 bUse1 = 63869

    bstat = 0xc00809 blog = 0x59ca2159 bnext = 0x0000000000000000

    PAGE HEADER:

    Page @0x000000008FC10000

    m_pageId = (1:19170) m_headerVersion = 1 m_type = 1

    m_typeFlagBits = 0x4 m_level = 0 m_flagBits = 0x200

    m_objId (AllocUnitId.idObj) = 60 m_indexId (AllocUnitId.idInd) = 1 Metadata: AllocUnitId = 281474980642816

    Metadata: PartitionId = 281474980642816 Metadata: IndexId = 1

    Metadata: ObjectId = 60 m_prevPage = (1:19169) m_nextPage = (1:25)

    pminlen = 17 m_slotCnt = 10 m_freeCnt = 2994

    m_freeData = 5246 m_reservedCnt = 0 m_lsn = (380:4947:9)

    m_xactReserved = 0 m_xdesId = (0:0) m_ghostRecCnt = 0

    m_tornBits = -1941682966

    Allocation Status

    GAM (1:2) = ALLOCATED SGAM (1:3) = NOT ALLOCATED

    PFS (1:16176) = 0x40 ALLOCATED 0_PCT_FULL DIFF (1:6) = CHANGED

    ML (1:7) = NOT MIN_LOGGED

    DBCC execution completed. If DBCC printed error messages, contact your system administrator.

    DBCC execution completed. If DBCC printed error messages, contact your system administrator.

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • Interesting...

    Are these DBs still on the 'recovered' drive?

    It looks as though the error is in the snapshot that CheckDB creates, not the DB itself. Can you run a CheckDB with the Tablock option?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • use master

    dbcc checkdb ( theotherdb ) with tablock

    go

    Msg 8921, Level 16, State 1, Line 2

    Check terminated. A failure was detected while collecting facts. Possibly tempdb out of space or a system table is inconsistent. Check previous errors.

    Msg 824, Level 24, State 2, Line 2

    SQL Server detected a logical consistency-based I/O error: incorrect checksum (expected: 0x8c4444ea; actual: 0x8c6444ca). It occurred during a read of page (1:19170) in database ID 11 at offset 0x000000095c4000 in file 'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data\theotherdb.mdf'. Additional messages in the SQL Server error log or system event log may provide more detail. This is a severe error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • ALZDBA (9/23/2010)

    ...When are people going to learn RAID is not a backup strategy !

    ...

    I think these people learned that:

    Death by Delete

    http://thedailywtf.com/Articles/Death-by-Delete.aspx

    "...Unfortunately, the single delete query proved to be far more than MegaPetCo could bear. Within a few months, the company filed for bankruptcy and was forced to close every one of its stores -- laying off several hundred people along the way..."

    Plenty of other people on this thread that learned things the hard way.

    It's dead, Jim

    http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=67782

  • very, very sad indeed

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • ALZDBA (9/23/2010)


    Msg 8921, Level 16, State 1, Line 2

    Check terminated. A failure was detected while collecting facts. Possibly tempdb out of space or a system table is inconsistent. Check previous errors.

    In that case, my previous diagnosis stands.

    Up creek, no paddle, boat sinking

    Try scripting objects and exporting data, if that fails, oops.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (9/23/2010)


    ALZDBA (9/23/2010)


    Msg 8921, Level 16, State 1, Line 2

    Check terminated. A failure was detected while collecting facts. Possibly tempdb out of space or a system table is inconsistent. Check previous errors.

    In that case, my previous diagnosis stands.

    Up creek, no paddle, boat sinking

    Try scripting objects and exporting data, if that fails, oops.

    Indeed, that's the only thing possible at this moment.

    Try to export as much as possible and take the loss.

    I hope they understand my advise to take regular sqlserver full and log backups fairly frequently.

    That would have restricted their current downtime to the time needed to replace the disks, install the server (c-drive was lost) and recover the db.

    That would have been a couple of hours, in stead of the current 4 days and counting.

    Thank you for your support and goodwill to doublecheck the dbcc checkdb results.

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply