SQLServerCentral Article

A guide to recover a database out from Suspect mode

,

Editor's Note: Please read the discussion on this article as there are comments disagreeing with portions of it.

One of the worst situations I can imagine for a database professional is to get a call reporting a production database is in a "Suspect" state and the business cannot continue. This is a "code red" situation where DBA needs to bring the database online as soon as possible. In this article, I will formulate couple of steps which may be used as a high level process to handle this situation.

The first step is always to inform your customer about the outage before they come back to you. I find most of the times this proactive step becomes a life saver. In this way, you will not be questioned for the time you will spend to bring the database online.

Then refer to the SQL Server error log to find the root cause of the issue. It will give you a clear reason mentioning why the database is in Suspect mode. From SQL Server 2005 onwards, I recommend to use a filter while viewing the log and use "Database name" as the filter criteria. In this way you will only see the logs related to that particular database and then refer to the latest log/s for the root cause. Up until SQL Server 2000, the only option is reading the log from the latest entry backwards to find the root cause. Assuming you found the reason why database is in suspect mode, now you need to take appropriate step to fix the issue.

Here I will discuss some of the possible issues which can put a database in Suspect mode and recommend appropriate resolution.

Possibility 1: Some one deleted/misplaced a data/log file when SQL Server was offline and the database can not start because of missing file. The solution is to place the missing data/log file in proper location. The SQL Server Error Log error message will give you the exact name and path of the missing file. Once you place the file execute below command to bring your database online with no data loss.

RESTORE DATABASE WITH RECOVERY

Possibility 2: SQL Server could not access or place an exclusive lock on the data or log file while coming online. Typically I experience this when SQL Server is shared with some other tool (like Antivirus), which puts an exclusive lock on the data/log file. To resolve it, use process explorer and kill the file handler which placed lock on the file. You may want to involve your System Admins to get this step executed. Then execute below command and you will have your database online with no data loss:

RESTORE DATABASE WITH RECOVERY

Possibility 3: This is a worst case scenario. Database is in suspect because of a corrupted transaction. This is a bad news as you may have to lose data at this point unless you have a good backup! Also this is the most common case I saw for putting an OLTP database in Suspect mode.

The root cause of this issue is actually from a guarantee taken by SQL Server to ensure transaction consistency under fundamental ACID property of RDBMS. The root cause of this issue is most likely SQL server abruptly went down/restarted in the middle of a transaction and while coming back, SQL server could not complete (commit/rollback) the transaction.

At this point, I recommend you to take a decision. If you have a good backup and can restore the database in an acceptable time up to an acceptable point, then go ahead for it. But if restore is not an option at this point, then you have to execute below steps:

Caution! Below steps will cause you to lose data and hence are extremely dangerous to execute. I recommend trying all other possible options including calling Microsoft Support before executing below steps.

1. Switch the Emergency mode on for the database using below command:

ALTER DATABASE SET EMERGENCY;

2. Then execute below command:

dbcc checkdb ('',repair_allow_data_loss)

Please note that as stated above this is an extremely dangerous command to execute. It is a one- way command (that is you can not rollback back this execution once you started it) which can cause loss in data or database integrity. Technically, by executing this command you are actually authorizing SQL Server to force transactional recovery to run and skip the errors. If it faces errors, this operation scavenges as much out of the transaction log as it can and then rebuilds the transaction log from scratch. So taking this step is really the last resort you should try after every other attempts fails.

After this operation is complete you will have your database back online. However, you'll most likely have lost a bunch of data, broken constraints and inherent business logic in the database but at least you haven't lost everything. You may want to involve your customers to run a sanity check on the data quality at this point.

Possibility 4: If you find out that your data file is corrupted then most likely you have OS / Hardware level failure. For this these type of failure or anything really weird which you can not fixed easily with in acceptable amount of time, your best bet is restore from backup with out really wasting time in a "Code Red" situation.

So in this article, I covered some possible reasons which can put a database in "Suspect" and then the options to be back in business in shortest possible time. To conclude, I would like to add below basic homework tips so that you can be in a good shape while handling this situation:

1. Always ensure that you have a good backup/DR strategy and your customer is in agreement with the risk of possible data/time loss in case of disaster
2. Do not ever attempt to "detach" a database which is in Suspect. This will do nothing but you will probably loose your ability to repair it.
3. Read and understand the error log before taking any action and do not panic. Nothing in computer science is "unexplainable" so if you keep your calm and apply intelligence, you will come out of any situation with honor.

If you have any recommendation or suggestion or experience some situation where this article did not help you to resolve the situation, please drop me a note at info@consultdba.com .

Editor's Note: Please read the discussion on this article as there are comments disagreeing with portions of it.

 

 

Rate

2.98 (49)

You rated this post out of 5. Change rating

Share

Share

Rate

2.98 (49)

You rated this post out of 5. Change rating