Database corruptions do not happen frequently, but when they happen, it is always a very troublesome and stressful event for DBAs. Many times corruptions are caused by hardware or subsystem issues. We, as DBAs, don't often have much to do in preventing corruptions, but we can try to detect them as soon as possible. The earlier we find it, the more options we have to fix the problems.
This article is not going to talk about how to fix corruptions. Instead, I will share several things DBAs can do to detect database corruptions. I will discuss the following four topics:
- Recovery - Page Verfiy setting
- Database Backup with checksum
- DBCC checkdb
- Database Alert
Recovery - Page Verify Option
The Page Verfiy option is set on each database. It can be set from SSMS or using a T-SQL script. This option tells the database how the page health is verified when a page is read or written.
To set this from SSMS, right click on a database, and choose properties. The Database Properties window opens. On the Options page, under the Recovery section, there is the Page Verify option. This is highlighted in the image below. Click the down arrow on the far right to set this.
To change the option using TSQL, use this code:
ALTER DATABASE <database name> SET PAGE_VERIFY CHECKSUM WITH NO_WAIT;
There are three options you can set for this options
- NONE does not page verification
- TORN_PAGE_DETECTION is an old option, and not the best choice if you are on SQL Server 2005 or above.
- CHECKSUM is the best one to choose.
If the database is created in a newer version of SQL Server, by default CHECKSUM is set. If the database was upgraded/migrated from an older version, be sure to manually change this option to CHECKSUM. Although it does not change all to checksum right way, database will gradually change page verification to CHECKSUM when read/write happens.
If a corrupted page is accessed, SQL Server will raise an error (823, 824, or 825). Be aware, if corrupted pages are not accessed, no error will be raised.
Backup with Checksum
Always run the backup database command with the checksum option. As described in Books Online: "Specifies that the backup operation verifies each page for checksum and torn page, if enabled and available, and generate a checksum for the entire backup." This means that SQL Server does the page checking during the backup process. If corruption is detected, the backup will fail. If the backup job is set to send notification when it fails, you are informed by email.
As an example, your backup command should look similar to this:
BACKUP DATABASE AdventureWorks TO DISK='X:\SQLServerBackups\AdventureWorks.bak' WITH CHECKSUM;
Actually, I like to set the backup job always send notification when the job finishes, no matter if it fails or succeeds. That way I am sure the job did run and is not for some reason.
This is the thorough way of checking the whole database. DBCC CHECKDB covers it all. This database check takes some time to run, depending on your database size. Usually it is scheduled to run during a maintenance window. This should be run either daily or weekly, depending on what your system can spare for resource capacity. It is recommended that this be run weekly or more frequently.
You run this as follows. Replace database_name with the name of your database.
When corruptions are detected, this job may succeed with information in the output or fail with an error message. I send the job output to a text file, and then let the job send the text file to me via email. Again, the job is set to notify me no matter the job fail or succeed.
DBAs should be alerted if any critical errors occur in a database. Glenn Berry has a good script to set these alerts up at this link: https://www.sqlskills.com/blogs/glenn/creating-sql-server-agent-alerts-for-critical-errors/
This check includes corruption errors as well, such as 823, 824, 825, etc.
If corruptions happen, get notified ASAP. Make sure you have good backups available all the time.
And good luck :).