Bradley Ball is a MCITP SQL 2005 & MCTS 2008 DBA with over 10 years of IT experience. Bradley spent 8 years working as a Defense contractor for clients such as the U.S. Army and The Executive Office of the President of the United States. He is currently a Sr. Consultant for Pragmatic Works. He has presented at SQL Saturdays 62, 74, 79, 85, 86, 131, for the MAGICPASS & OPASS SSUG’s, SQL Rally 2011 & 2012, SQL Dev Connections 2012, the PASS Summit in 2011, and is scheduled for the PASS Summit 2012 and SQL Live 360 later this year. He recently finished Chapter 14 of Expert SQL Server Practices on Page & Row Compression and can be found blogging on http://www.sqlballs.com.
This one comes straight from the email bag. A friend recently had a problem, they were placing TDE on a database and the encryption scan had stopped at state 2 percent_complete 0. I'm bouncing around the Charlotte NC airport facing some plane delays, and I thought what better time than to play around with a little database corruption.
“So Balls”, you say, “What does TDE stuck in an encryption scan have to do with corruption.”
Great Question Dear Reader! The default Page_Verify setting in SQL Server is Checksum. This means when a page is read into memory and written back to disk a Checksum is calculated based off the pages contents. When it is written back to disk, that Checksum is there. When the page is read again the Checksum is used as a validation. If the Checksum fails then it tosses an error reporting the page as a Suspect Page.
Think of this like going through the TSA Checkpoint, you’ve got your ticket and your identification. If your ticket says ‘Serenity’, but your ID says ‘Zachary’ you will probably get flagged by the system as Suspect. In both cases that’s where the probing begins.
WE’VE GOT A PROBLEM
For this example I’m going to use a database that I’ve corrupted called CorruptAdventure taken from a corrupted version of AdventureWorksDW2008R2. Horrible name for a database, it was just asking for corrupting. We’ll start out assuming everything is fine. The powers that be want TDE, Transparent Data Encryption, enabled on the database and we will do that. First we’ll create our Master Key and a Database Certificate to use in the encryption.
Create Master Key
Create Master Key Encryption By Password='MasterKeyPass1'
Create Certificate DatabaseCertificate WithSubject='Dont Put Anything Importiant in the subject'
Now we’ll point to CorruptAdventure and create a Database Encryption Key and set encryption to on. Transparent Data Encryption will read each page into memory. If it doesn’t have a checksum one will get written. Our page has a checksum, but it’s contents have been corrupted. When SQL calculates a checksum to validate the current on, the page will get logged to the MSDB.dbo.Suspect_Pages table.
create database encryption key
with algorithm = aes_256
encryption byserver certificateDatabaseCertificate
Alter Database CorruptAdventure
Set Encryption on
It looks like it is encrypting!
Whoa! We hit our error.
Let’s query our Suspect_Pages table. Just like I thought we’ve got our database ID and our page ID. The error_type column is equal to 2, this means our page was flagged suspect during a Checksum operation.
It just stalled out
Why would this happen?
A page checksum occurs on all pages
whent the TDE scan
select * from msdb.dbo.suspect_pages
Now let’s run DBCC CheckDB and verify if we really have something wrong with our database.
DBCC CHECKDB(CorruptAdventure) WITH NO_INFOMSGS
Msg 8928, Level 16, State 1, Line 1
Object ID 325576198, index ID 5, partition ID 72057594043498496, alloc unit ID 72057594044940288 (type In-row data): Page (1:3874) could not be processed. See other errors for details.
Msg 8939, Level 16, State 98, Line 1
Table error: Object ID 325576198, index ID 5, partition ID 72057594043498496, alloc unit ID 72057594044940288 (type In-row data), page (1:3874). Test (IS_OFF (BUF_IOERR, pBUF->bstat)) failed. Values are 12716041 and -4.
Msg 8976, Level 16, State 1, Line 1
Table error: Object ID 325576198, index ID 5, partition ID 72057594043498496, alloc unit ID 72057594044940288 (type In-row data). Page (1:3874) was not seen in the scan although its parent (1:3888) and previous (1:3873) refer to it. Check any previous errors.
Msg 8978, Level 16, State 1, Line 1
Table error: Object ID 325576198, index ID 5, partition ID 72057594043498496, alloc unit ID 72057594044940288 (type In-row data). Page (1:3875) is missing a reference from previous page (1:3874). Possible chain linkage problem.
CHECKDB found 0 allocation errors and 4 consistency errors in table 'FactInternetSales' (object ID 325576198).
CHECKDB found 0 allocation errors and 4 consistency errors in database 'CorruptAdventure'.
repair_allow_data_loss is the minimum repair level for the errors found by DBCC CHECKDB (CorruptAdventure).
Just as I suspected corruption. We got the Database ID and page number from Suspect_Pages and DBCC CHECKDB just verified that the page is indeed corrupt. Now we can find exactly what type of data is corrupted, which will determine our strategy for handling it. We have the Object ID and Index ID for the DBCC CHECKDB Scan.
We can do a query against sys.indexes joined to sys.objects using the IndexID, 5, and ObjectId, 32576198, provided. We will get the table name, index name, and index type.
o.name as TableName
,i.name as IndexName
left join sys.objects o
Our corruption is on a non-clustered index. If you ever get corruption this is one of the easiest types to fix. We drop our non-clustered index and re-create it, and it should fix everything.
DROP INDEX IX_FactInternetSales_OrderDateKey ON dbo.FactInternetSales
CREATE NONCLUSTERED INDEX IX_FactInternetSales_OrderDateKeyON dbo.FactInternetSales
Now let’s Run DBCC CHECKDB to get a clean bill of health.
DBCC CHECKDB(CorruptAdventure) WITH NO_INFOMSGS
Excellent, looking at our TDE status it still hasn’t moved.
The TDE encryption scan should have paused when the Checksum error occurred. In case it didn’t you can manually pause the encryption scan and reset it with Trace Flag 5004. Turning Trace Flag 5004 on will stop the encryption scan right where it is. You then need to turn Trace Flag 5004 off so you can re-issue the encryption command and watch it commence. You might not need to use Trace Flag 5004, but I like to play this one on the safe side.
ALTER DATABASE CorruptAdventure
SET ENCRYPTION ON
Let’s check our encryption status.
We are progressing again, and it looks like now we’ve completed! Excellent, not only did we get our database encrypted but we were able to fix corruption that we were not previously aware of. One last peek at our TDE scan and we see it is complete and our encryption_status is now 3, no longer stuck at 2.
Well my plane just arrived, so that’s all for now Dear Reader, as always Thanks for stopping by.