TDE - Encryption on a large database

  • I'm in the process of researching TDE implementation on a VLDB and would like to know if anyone has real world experience with enabling encryption on a production server.

    What kind of problems (if any) can I expect? My testing shows very little impact in a test environment, but I'm concerned about how long this is going to take to setup in prod. Any feedback is appreciated.

  • We use TDE with an HSM for several multi-TB highly-transactional databases.

    Once you set up and enable the encryption, it will run silently in the background encrypting the database while normal transaction activity proceeds. While the initial encryption is running, you can check the percent_complete column in sys.dm_database_encryption_keys to see progress. Encrypting a database requires all data and indexes to be read from disk, encrypted, and written back to disk, so there will be an I/O and CPU hit during the initial encryption. Unless your system is currently struggling, you should be OK. DMVs and DMFs don't change at all against a TDE database, so you can continue to monitor waits and I/O rates to measure any impact against production activity.

    Sure, there's a light perf hit to TDE, but if you want to encrypt your data at rest, then you're going to take a hit somewhere.

    The critical aspects to TDE revolve around protecting your key (including separate backups of the encryption key and protection steps for it), and being able to restore and recover that database and its data somewhere else. Just like any other DR plan, restoring TDE databases is something you must practice before the disaster occurs. You do not want to find out after a database corruption event that you weren't backing up the TDE key correctly.

    Additionally, you must keep an eye on the transaction log of a TDE database. VLF sizing affects TDE database recovery (database startup time) during an instance restart more than on non-TDE databases. During start-up crash recovery, entire VLFs must be decrypted even if there is only one transaction in the VLF that must be scanned for recovery. Many tiny VLFs always slow things down, but huge VLFs can seriously slow down TDE database startup times. Here's an MSDN blog post that is helpful on the VLF stuff, although it was not specifically written for TDE use:

    https://blogs.msdn.microsoft.com/blogdoezequiel/2011/05/24/sql-swiss-army-knife-9-fixing-vlfs/

    -Eddie

    Eddie Wuerch
    MCM: SQL

  • Thanks Eddie! This is great feedback! I especially appreciate the info about VLFs. That's the first time that I've seen that mentioned during my research.

  • Watch the log during TDE encryption as well. Logs written during this process.

    Great link, Eddie. Hadn't seen that either.

  • Also, once you TDE any database on an instance, then tempdb also automatically becomes a TDE database to prevent any TDE-encrypted data from being written unencrypted to a tempdb file. You may also experience a CPU hit (usually small) on heavy tempdb usage as well, as tempdb transaction log writes and data file writes must be encrypted.

    Eddie Wuerch
    MCM: SQL

  • I read somewhere that I'll need to disable Backups and database maintenance during the encryption process. Is this true?

    This is very concerning, since my testing took ~4 days to complete the encryption.

  • brickpack (7/7/2016)


    I read somewhere that I'll need to disable Backups and database maintenance during the encryption process. Is this true?

    This is very concerning, since my testing took ~4 days to complete the encryption.

    That's interesting. I've never heard of this. I'd have to test, which I did. It doesn't appear that the backup is blocked, but this was a quick test.

    I wouldn't think maintenance would be blocked, index rebuilds, etc., as these are operations similar to DML, and they just write encrypted data back to the database. I suppose they will slow things down.

    Backups made while the encrpytion is in progress could potentially have some encrypted and some unencrypted pages. In this case, I think once encryption is complete, you'd want a new set of backups. Once you are sure you are outside your RPO and have a couple good, encrypted backups, I might make sure I delete/overwrite any potentially unencrypted backups.

  • Steve... I was very surprised as well. Here's where I found it.

    https://msdn.microsoft.com/en-us/library/bb934049(v=sql.105).aspx

    Look at the restrictions section. Maybe I'm misinterpreting what that's saying.

    Thanks

  • OK, so when I read that, I think I must have either had encryption go quicker than expected, or missed a block somewhere.

    I'd read that as no backups during encryption, so that's an issue. Will it really take 4 days for you to encrypt a db? Or is that on a lower level of hardware.

    I read that you can't start encryption while maintenance is running, but I don't see that restricted while it's running.

  • Eddie Wuerch (6/9/2016)


    We use TDE with an HSM for several multi-TB highly-transactional databases.

    Once you set up and enable the encryption, it will run silently in the background encrypting the database while normal transaction activity proceeds. While the initial encryption is running, you can check the percent_complete column in sys.dm_database_encryption_keys to see progress. Encrypting a database requires all data and indexes to be read from disk, encrypted, and written back to disk, so there will be an I/O and CPU hit during the initial encryption. Unless your system is currently struggling, you should be OK. DMVs and DMFs don't change at all against a TDE database, so you can continue to monitor waits and I/O rates to measure any impact against production activity.

    Sure, there's a light perf hit to TDE, but if you want to encrypt your data at rest, then you're going to take a hit somewhere.

    The critical aspects to TDE revolve around protecting your key (including separate backups of the encryption key and protection steps for it), and being able to restore and recover that database and its data somewhere else. Just like any other DR plan, restoring TDE databases is something you must practice before the disaster occurs. You do not want to find out after a database corruption event that you weren't backing up the TDE key correctly.

    Additionally, you must keep an eye on the transaction log of a TDE database. VLF sizing affects TDE database recovery (database startup time) during an instance restart more than on non-TDE databases. During start-up crash recovery, entire VLFs must be decrypted even if there is only one transaction in the VLF that must be scanned for recovery. Many tiny VLFs always slow things down, but huge VLFs can seriously slow down TDE database startup times. Here's an MSDN blog post that is helpful on the VLF stuff, although it was not specifically written for TDE use:

    https://blogs.msdn.microsoft.com/blogdoezequiel/2011/05/24/sql-swiss-army-knife-9-fixing-vlfs/

    -Eddie

    This is great info! thanks.

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply