• Sean Lange (5/19/2014)


    Eric M Russell (5/19/2014)


    Maxer (5/18/2014)


    A company I work with stores all HR data for all past and present employees in a table that has SSN, DOB, address, and full name.

    This sort of sets my "Spidey Sense" tingling all that personal data in one table where all of the IT dev team can get full access.

    I was wondering about some better options?

    1. Limit access (for development I was thinking swap out real SSN with fakes on dev and UAT)

    2. Encryption options? Column level inside SQL 2008 R2? Other options? This is a 3rd party ERD so I can't change the values in the column itself without some sort of abstraction voodoo happening?

    3. Any other thoughts?

    Generally speaking, the entire IT department, including development, doesn't need access to production in any capacity, so I'm assuming we're just talking about how to protect sensitive data in the development and UAT environment.

    The ETL process you use to refresh development and UAT, you can substitute actual SSN, First Name, Last Name with a hash.

    print HASHBYTES('MD5','111223333')

    0x3A6838DE381E20C401DB7629508DB352

    print cast(HASHBYTES('MD5','111223333') as varchar(9))

    :h8Þ8 Ä

    This works but doesn't address the legitimate concern that this type of sensitive data should NOT be stored in clear text even in production. This type of thing should be encrypted or hashed. Hashing it only when it gets ported to a dev/test environment is only part of the solution.

    Yes, in production, it's good to have symmetric key encryption on senstive columns, in addition to role based security and least privilege.

    However, symmetric key encryption requires varbinary datatype, setting up encryption key, and then using decrypt and re-cast functions within select statements at runtime, so it involves a significant amount of re-architecture. What I've done in the past is implement a view for each table that contains symmetric encrypted columns with the decrypted and recasted computed columns, so the application or BI doesn't have to mess with that part, they just have to open the symmetric key and select from the view as if it were the base table.

    For the dev and QA environment it's a lot more straightforward, just give them the hashed or encrypted columns and don't let them decrypt it.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho