PAGELATCH_UP Won't Let Go!

  • All,

    I ran into an issue where my backups and checkdb jobs are failing on one database.

    Result of a checkdb

    Msg 1823, Level 16, State 2, Line 1

    A database snapshot cannot be created because it failed to start.

    Msg 7928, Level 16, State 1, Line 1

    The database snapshot for online checks could not be created. Either the reason is given in a previous error or one of the underlying volumes does not support sparse files or alternate streams. Attempting to get exclusive access to run checks offline.

    Msg 5030, Level 16, State 12, Line 1

    The database could not be exclusively locked to perform the operation.

    Msg 7926, Level 16, State 1, Line 1

    Check statement aborted. The database could not be checked as a database snapshot could not be created and the database or table could not be locked. See Books Online for details of when this behavior is expected and what workarounds exist. Also see previous errors for more details.

    Msg 845, Level 17, State 1, Line 1

    Time-out occurred while waiting for buffer latch type 3 for page (1:2872334), database ID 25.

    sysprocesses shows spid 16 with a PAGELATCH_UP and a waitresource 25:1:2872334 and cmd of CHECKPOINT

    This makes me think there is a hung process out there somewhere with a latch on the specified page. Short of restarting the service, what other action should I be taking. I didn't want to jump right in and restart the service for fear of causing more harm than good.

    Let me know if providing more information would be helpful.

    ______________________________________________________________________________________________
    Forum posting etiquette.[/url] Get your answers faster.

  • This is on 2008 R2 SP2. I know there's a KB addressing a similar bug for 2008.

    ______________________________________________________________________________________________
    Forum posting etiquette.[/url] Get your answers faster.

  • It is interesting that when I run a backup command, the wait types switch.

    So normally spid 16 has the pagelatch_up and the wait resource is 1:2872334.

    When I run a backup (which fails with the same error) the backup spid has the pagelatch_up wait type and the wait resource is 1:2872334.

    Meanwhile, the spid 16 changes to lck_m_x wait type with a wait resource of spid <backupspid>.

    When the backup fails, spid 16 changes back to pagelatch_up as before.

    ______________________________________________________________________________________________
    Forum posting etiquette.[/url] Get your answers faster.

  • Here's what I did to resolve the issue.

    I set to single_user with rollback immediate which didn't resolve the issue. I couldn't switch back to multi_user because the system id was connected to the database. I couldn't take the database offline for the same reason.

    I ended up restarting the service and waiting for the database to finish recovering then switching back to multi_user. A backup was performed along with a checkdb. All seems well for now.

    ______________________________________________________________________________________________
    Forum posting etiquette.[/url] Get your answers faster.

  • calvo (6/10/2013)


    This is on 2008 R2 SP2. I know there's a KB addressing a similar bug for 2008.

    What's you're exact build #?

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • 10.50.4000

    ______________________________________________________________________________________________
    Forum posting etiquette.[/url] Get your answers faster.

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply