How to check progress of database (in recovery)

  • thank you for your help. I will read over that blog post and hopefully that will give us some ideas so we can reduce the time it takes to recover next time.

  • This is a good query that gives %complete and helps estimate the finishing time also.

    SELECT r.session_id,r.command,CONVERT(NUMERIC(6,2),r.percent_complete)

    AS [Percent Complete],CONVERT(VARCHAR(20),DATEADD(ms,r.estimated_completion_time,GetDate()),20) AS [ETA Completion Time],

    CONVERT(NUMERIC(10,2),r.total_elapsed_time/1000.0/60.0) AS [Elapsed Min],

    CONVERT(NUMERIC(10,2),r.estimated_completion_time/1000.0/60.0) AS [ETA Min],

    CONVERT(NUMERIC(10,2),r.estimated_completion_time/1000.0/60.0/60.0) AS [ETA Hours],

    CONVERT(VARCHAR(1000),(SELECT SUBSTRING(text,r.statement_start_offset/2,

    CASE WHEN r.statement_end_offset = -1 THEN 1000 ELSE (r.statement_end_offset-r.statement_start_offset)/2 END)

    FROM sys.dm_exec_sql_text(sql_handle)))

    FROM sys.dm_exec_requests r WHERE command IN ('DB Startup')

  • Here is what I prefer to use:

    http://bit.ly/DBRecovery2

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • We had one of our biggest databases (>100 Tb) being suspect. Analysing the logs said it was caused by to some failing disk operation during a checkpoint. Which is kind of strange as we are using a SQL cluster with Storage Direct Spaces containing only NVMe disks.

    I knew the last backup contained consistency errors. As we restore every backup to another server and check it there. So restoring this one might introduce the same problem again.  So I ran DBCC CheckDB with REPAIR_ALLOW_DATA_LOSS. It took about 2,5 days to finish.

    A few days later I noticed that the log backups did not truncate the log files and had to add a new log file.  This happened over and over again. Log backup didn't do the job. Putting it into SIMPLE recovery model neither.  It kept on saying that there was an active transaction although dbcc opentran didn't give anything.  Closing all connections didn't do the job either.

    After 1 week the transaction logs grew to 10 TB. It was time to take another action.  So I backed it up, and then moved the SQL role to another node (which is more or less the same as restarting sql server service).

    All databases on the instance were up and running except the big database was now In Recovery.  There were only a few interesting items in the log:

    Recovery of database 'Keyboost' (19) is 0% complete (approximately 12 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.

    3434 transactions rolled forward in database 'Keyboost' (19:0). This is an informational message only. No user action is required.

    Except for a bunch of Deleting unrecoverable checkpoint table row  there seemed to be no activity except for logon failures.  Note that if you want to get rid of all logon failures in your sql log, use this script:

    DROP TABLE IF EXISTS #SQLErrorLog; --assuming SQL Server 2016+...otherwise use the typical if object exists drop

    CREATE TABLE #SQLErrorLog
    (
    LogDate DATETIME
    ,ProcessInfoNVARCHAR(12)
    ,LogText NVARCHAR(3999)
    )

    INSERT INTO #SQLErrorLog
    (
    LogDate
    ,ProcessInfo
    ,LogText
    )
    EXEC sp_readerrorlog;

    SELECT * FROM #SQLErrorLog
    where processinfo <>'logon'
    order by logdate desc

    None of the scripts above gave anything except this one:

    SELECT r.session_id,r.command,CONVERT(NUMERIC(6,2),r.percent_complete)
    AS [Percent Complete],CONVERT(VARCHAR(20),DATEADD(ms,r.estimated_completion_time,GetDate()),20) AS [ETA Completion Time],
    CONVERT(NUMERIC(10,2),r.total_elapsed_time/1000.0/60.0) AS [Elapsed Min],
    CONVERT(NUMERIC(10,2),r.estimated_completion_time/1000.0/60.0) AS [ETA Min],
    CONVERT(NUMERIC(10,2),r.estimated_completion_time/1000.0/60.0/60.0) AS [ETA Hours],
    CONVERT(VARCHAR(1000),(SELECT SUBSTRING(text,r.statement_start_offset/2,
    CASE WHEN r.statement_end_offset = -1 THEN 1000 ELSE (r.statement_end_offset-r.statement_start_offset)/2 END)

    FROM sys.dm_exec_sql_text(sql_handle)))
    FROM sys.dm_exec_requests r WHERE command IN ('DB Startup')

    The percentage completed staid at 3% for hours and hours, increasing the ETA from hours to days and even weeks.  Then I noticed what Gail Shaw said.  Going over the 10 Tb of log files would indeed take some time, even on our flash only cluster.  I went to see in the Disk Resource Monitor, normally on your sql server, but in our case it resides on the node where the SQL cluster share volume is located, and I noticed that the logs were being read at about 2 Gb/s.  Then I was reassured that at least it was doing something.

    After about 6 hours the next message appeared in the log stating that the recovery was completed.  Now, 4 hours later it is still clearing out the log files, but it is progressing.  When this is finished, I still need to remove all additional log files I created, switch to Full recovery mode and do a full backup again.

    Gregory Liénard
    Microsoft Data Engineer
    https://seopageoptimizer.com/: analyses billions of webpages
    https://keyboost.com/: analyses trillions of links

    If the statistics are boring, you’ve got the wrong numbers.

Viewing 4 posts - 16 through 19 (of 19 total)

You must be logged in to reply to this topic. Login to reply