Error 7929, Check statement aborted. Database contains deferred transactions.

  • Around 12:51 this morning I was paged because the mssql service had restarted unexpectedly. I investigated and found what appears to be a memory issue in a backup process:
    SQLVDI: Loc=SignalAbort. Desc=Client initiates abort. ErrorCode=(0). Process=10756. Thread=24412. Client. Instance=SERVERNAME. VD=Global\//LOCATION/Database_Log_201808100052.safe_SQLVDIMemoryName_0.

    The mssql service came back online and everything appeared to be okay until later in the day, when a simple process ran to update flags on a table and I noticed blocking. I did a few standard things, checked the stats, indexes, etc., and then noticed there was a lingering transaction from 12:51 AM:

    Transaction information for database 'CORELIBRARY'.

    Oldest active transaction:
      SPID (server process ID): 518
      UID (user ID) : -1
      Name    : user_transaction
      LSN    : (365886:1284741:19)
      Start time  : Aug 10 2018 12:51:39:473AM
      SID    : 0x010500000000000515000000be043e32bb0ada1107e53b2b5f130000

    Replicated Transaction Information:
       Oldest distributed LSN  : (365887:3062532:5)
       Oldest non-distributed LSN : (365886:1284741:19)
    DBCC execution completed. If DBCC printed error messages, contact your system administrator.
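
    (For reference, that's the output of DBCC OPENTRAN; to run the same check yourself it's just:)

    -- Reports the oldest active transaction and, on a published database,
    -- the oldest non-distributed replicated transaction
    DBCC OPENTRAN ('CORELIBRARY');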

    Any attempt to update the table or drop a non-clustered index causes blocking with an LCK_M_SCH_M wait type, and running any integrity check (CHECKDB or CHECKTABLE) results in the 7929 error.
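
    To see which sessions are blocked and on what, a quick look at sys.dm_exec_requests is enough (just a sketch; any blocking/monitoring query will do):

    -- Blocked requests with their wait type and the resource they're waiting on
    SELECT r.session_id, r.blocking_session_id, r.wait_type, r.wait_resource, r.command
    FROM sys.dm_exec_requests AS r
    WHERE r.blocking_session_id <> 0
       OR r.wait_type IN ('LCK_M_SCH_M', 'LCK_M_U');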

    I've dug around and found the transaction in the transaction log, and I believe it is related to a replication command that never got committed to the subscriber, but I'm unable to resolve the issue.
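
    To confirm the log is still holding replicated commands that haven't been marked as distributed, these two procedures report on it without changing anything (run in the publication database):

    USE CORELIBRARY;
    GO
    -- Transactions marked for replication but not yet marked as distributed
    EXEC sp_repltrans;
    GO
    -- Replication counters for the published database (transaction counts, latency, log-reader LSNs)
    EXEC sp_replcounters;
    GO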

    At this point, I don't know what else to do...
    My last-resort plan for this evening is:

    ALTER DATABASE CORELIBRARY SET EMERGENCY
    ALTER DATABASE CORELIBRARY SET SINGLE_USER

    DBCC CHECKTABLE ('achstlmtIntermediate') WITH NO_INFOMSGS, ALL_ERRORMSGS
    --(Hope there's nothing out of the ordinary here)
    DBCC CHECKDB ('corelibrary') WITH NO_INFOMSGS, ALL_ERRORMSGS
    --(Pray it's just a bad non-clustered index so I can rebuild it and be done with it)

    ALTER DATABASE CORELIBRARY SET ONLINE
    ALTER DATABASE CORELIBRARY SET MULTI_USER
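
    (If anything is still connected when I run this, the SINGLE_USER step will just sit and wait, so I may end up using the ROLLBACK IMMEDIATE form instead:)

    ALTER DATABASE CORELIBRARY SET SINGLE_USER WITH ROLLBACK IMMEDIATE;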

    If all goes well, enjoy Friday night...

    Has anyone run into this before?  And other than running CHECKDB with REPAIR_ALLOW_DATA_LOSS, what steps did you take to resolve it?

    ______________________________________________________________________________Never argue with an idiot; they'll drag you down to their level and beat you with experience

  • How did you get on with this?
    😎

    Last time I ran into the 7929, I ended up restoring from the latest backups (5-minute transaction log backup interval, so not much of a risk).

  • We never did the maintenance (which is good, because I'd like more time to research).
    I'm going to try to restore a copy of the database from after the service restart, in an effort to test against it first.
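
    The restore-a-copy step will look roughly like this (paths, logical file names, and the STOPAT time below are placeholders, not the real ones):

    -- Restore the copy under a different name so it can be poked at safely
    RESTORE DATABASE CORELIBRARY_COPY
        FROM DISK = N'X:\Backups\CORELIBRARY_full.bak'
        WITH MOVE 'CORELIBRARY_Data' TO N'X:\Data\CORELIBRARY_COPY.mdf',
             MOVE 'CORELIBRARY_Log'  TO N'X:\Data\CORELIBRARY_COPY.ldf',
             NORECOVERY;

    -- Apply log backups up to a point shortly after the service restart, then recover
    RESTORE LOG CORELIBRARY_COPY
        FROM DISK = N'X:\Backups\CORELIBRARY_log_201808100100.trn'
        WITH STOPAT = '2018-08-10 01:10:00', RECOVERY;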

    When I try to run an update against the table in question, it hangs and I immediately get a wait_type = LCK_M_U and a wait_resource = PAGE: 11:4:20851363. When I look at what's on the page:
    DBCC TRACEON(3604)   -- route DBCC PAGE output to the client instead of the error log
    GO
    DBCC PAGE(11,4,20851363)   -- database_id 11, file_id 4, page_id 20851363
    GO
    DBCC TRACEOFF(3604)
    GO
    I get:

    DBCC execution completed. If DBCC printed error messages, contact your system administrator.
    PAGE: (4:20851363)
    BUFFER:
    BUF @0x000000D9A4B6A680

    bpage = 0x000000D8E480C000    bhash = 0x0000000000000000    bpageno = (4:20851363)
    bdbid = 11         breferences = 0       bcputicks = 0
    bsampleCount = 0       bUse1 = 31922        bstat = 0x9
    blog = 0x7a7a7a7a       bnext = 0x0000000000000000   

    PAGE HEADER:
    Page @0x000000D8E480C000

    m_pageId = (4:20851363)     m_headerVersion = 1      m_type = 2
    m_typeFlagBits = 0x0      m_level = 0         m_flagBits = 0x200
    m_objId (AllocUnitId.idObj) = 2635 m_indexId (AllocUnitId.idInd) = 256
    Metadata: AllocUnitId = 72057594210615296          
    Metadata: PartitionId = 72057594194165760           Metadata: IndexId = 53
    Metadata: ObjectId = 933578364  m_prevPage = (4:20806697)    m_nextPage = (4:7397261)
    pminlen = 461        m_slotCnt = 13        m_freeCnt = 1309
    m_freeData = 7905       m_reservedCnt = 0       m_lsn = (365887:2468408:9)
    m_xactReserved = 0      m_xdesId = (1:-1848951422)    m_ghostRecCnt = 0
    m_tornBits = -1926036539    DB Frag ID = 1       

    Allocation Status
    GAM (4:20449280) = ALLOCATED   SGAM (4:20449281) = NOT ALLOCATED 
    PFS (4:20850864) = 0x40 ALLOCATED 0_PCT_FULL         DIFF (4:20449286) = CHANGED
    ML (4:20449287) = NOT MIN_LOGGED 

    DBCC execution completed. If DBCC printed error messages, contact your system administrator.
    DBCC execution completed. If DBCC printed error messages, contact your system administrator.

    Kind of stuck from here...

    ______________________________________________________________________________Never argue with an idiot; they'll drag you down to their level and beat you with experience

  • MyDoggieJessie - Saturday, August 11, 2018 9:09 AM

    What is the object with the ID of 933578364?
    😎
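
    Something like this will tell you (the IDs come straight from the DBCC PAGE output above; run the sys.indexes part in the database the page belongs to):

    -- Resolve the object name; 11 is the database_id from the page dump
    SELECT OBJECT_NAME(933578364, 11) AS object_name;

    -- And the index the page belongs to (Metadata: IndexId = 53)
    SELECT name, type_desc
    FROM sys.indexes
    WHERE object_id = 933578364
      AND index_id  = 53;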

  • Oh, that object ID is the actual table in question.
    I did try restoring a copy of the database at a point in time about 15 minutes after the database was brought back online (which completed successfully). In checking that copy, the problem didn't exist. I'm not sure whether that's because the uncommitted transaction rolled forward or rolled back, but I couldn't find anything specific in the SQL Server error log other than that the restore had completed successfully.
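
    One extra check worth running against the copy (if I remember right, locks held by deferred transactions are reported under session_id -3, so this should come back empty on a healthy copy):

    -- Locks owned by a deferred (recovery) transaction show up as session_id -3
    SELECT resource_type, resource_database_id, resource_associated_entity_id,
           request_mode, request_status
    FROM sys.dm_tran_locks
    WHERE request_session_id = -3;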

    ______________________________________________________________________________Never argue with an idiot; they'll drag you down to their level and beat you with experience

  • Resolving this issue for me was simply a matter of placing the DB in EMERGENCY mode and bringing it back online:
    ALTER DATABASE PROBLEM_CHILD SET EMERGENCY
    GO

    Once it was in that state, I ran DBCC CHECKTABLE against the table I found the issue with (which completed successfully), then brought the database back online without issue, and noticed two transactions were finally rolled back!

    The error log showed:

    Setting database option EMERGENCY to ON for database 'PROBLEM_CHILD'.
    DBCC CHECKTABLE ('AVeryBadTable') WITH all_errormsgs, no_infomsgs executed by MyDoggieJessie found 0 errors and repaired 0 errors. Elapsed time: 0 hours 2 minutes 14 seconds.
    Recovery of database 'PROBLEM_CHILD' (11) is 0% complete (approximately 45201 seconds remain).
    Recovery of database 'PROBLEM_CHILD' (11) is 20% complete (approximately 156 seconds remain).
    Recovery of database 'PROBLEM_CHILD' (11) is 30% complete (approximately 138 seconds remain).
       ....
    Recovery of database 'PROBLEM_CHILD' (11) is 90% complete (approximately 16 seconds remain).
    2 transactions rolled back in database 'PROBLEM_CHILD'
    Recovery is writing a checkpoint in database 'PROBLEM_CHILD'
    Recovery completed for database PROBLEM_CHILD(database ID 11) in 195 second(s) (analysis 5 ms, redo 177742 ms, undo 5986 ms.)

    Once that all completed everything returned to normal.
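
    For anyone hitting the same thing later, the whole sequence was essentially (same anonymized names as above):

    ALTER DATABASE PROBLEM_CHILD SET EMERGENCY;
    GO
    -- Verify the table the deferred transaction was touching is physically intact
    DBCC CHECKTABLE ('AVeryBadTable') WITH ALL_ERRORMSGS, NO_INFOMSGS;
    GO
    -- Bringing it back out of EMERGENCY runs recovery, which rolled back the stuck transactions
    ALTER DATABASE PROBLEM_CHILD SET ONLINE;
    GO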

    I think I got very lucky... and I'll take that.

    ______________________________________________________________________________Never argue with an idiot; they'll drag you down to their level and beat you with experience
