• So what approach would you take? Would it be a script to mask the data post-restore? Or a security-controlled table with an overlaid view or proc that either masks the sensitive data on execute/query or decrypts the encrypted data on execute/query? Or would you use a 3rd party tool to generate duff data? Some other route?

    My first choice would be to restore the database while keeping users locked out, cleanse the data, and then let users in. Scripts that sanitize personally identifiable or sensitive data while maintaining the overall linkages between entities are tedious to write but relatively easy. Steps such as updating this name to that name, scrambling this number, or rewriting a credit card number by pulling a random one from a set of check-digit-valid dummy card numbers are trivial and do not require intimate knowledge of every parent-child relationship.
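
    As a rough illustration of the dummy card number step, here is a minimal Python sketch. Everything in it is an assumption for the example: the function names, the `400000` prefix, the salt, and the pool size are all hypothetical. It also swaps the purely random pick for a hash-based pick, so the same real value always maps to the same dummy value and rows that shared a number before masking still share one afterwards.

```python
import hashlib
import random


def luhn_check_digit(partial: str) -> str:
    """Return the Luhn check digit for a digit string that lacks one."""
    total = 0
    # Walk right to left; the rightmost digit of the partial number is doubled.
    for i, ch in enumerate(reversed(partial)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return str((10 - total % 10) % 10)


def luhn_valid(number: str) -> bool:
    """True if the full number (including its last digit) passes the Luhn check."""
    return number.isdigit() and luhn_check_digit(number[:-1]) == number[-1]


def dummy_card_pool(size: int, seed: int = 0, prefix: str = "400000") -> list:
    """Build a pool of 16-digit, check-digit-valid dummy numbers (hypothetical prefix)."""
    rng = random.Random(seed)
    pool = []
    for _ in range(size):
        body = prefix + "".join(str(rng.randrange(10)) for _ in range(15 - len(prefix)))
        pool.append(body + luhn_check_digit(body))
    return pool


def mask_card(real_number: str, pool: list, salt: str = "dev-restore") -> str:
    """Pick a dummy number deterministically so repeated real values stay linked."""
    digest = hashlib.sha256((salt + real_number).encode()).digest()
    return pool[int.from_bytes(digest[:8], "big") % len(pool)]


pool = dummy_card_pool(1000)
masked = mask_card("4111111111111111", pool)
```

    In practice the resulting mapping would be applied with UPDATE statements while users are still locked out. Note that a small pool means different real numbers can collide on the same dummy value, which is usually acceptable for test data.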

    Second choice would be to use a data generator, but I fear that would become even more tedious, depending on how good the toolset is.

    My issue with something like Data Generator (I have the SQL Toolbelt) is that it is restricted: it's either a whole table or nothing, which isn't what I want since that will result in inconsistent data. Additionally, it isn't set up to be all that automated. You need to build a project and then run it from the command line. It would be amazing to be able to do something along the lines of "data generator" [column] [data type]

    Visual Studio has a data generator that may prove to be more flexible for you. I have used it to do unit testing but do not remember the details surrounding that specific nuance.
