Best practices for storing SSNs?

  • A company I work with stores all HR data for all past and present employees in a table that has SSN, DOB, address, and full name.

    This sort of sets my "Spidey Sense" tingling: all that personal data in one table where the entire IT dev team can get full access.

    I was wondering about some better options?

    1. Limit access (for development I was thinking swap out real SSN with fakes on dev and UAT)

    2. Encryption options? Column level inside SQL 2008 R2? Other options? This is a 3rd party ERD so I can't change the values in the column itself without some sort of abstraction voodoo happening?

    3. Any other thoughts?

  • Hi

    Swapping out real SSNs with fakes could cause other issues.

    I used to do the same thing by scrambling the names, addresses, and DOBs.

    Encrypting the SSNs is a good option too.

    Regards,

    Igor

    Igor Micev, My blog: www.igormicev.com

  • Maxer (5/18/2014)


    A company I work with stores all HR data for all past and present employees in a table that has SSN, DOB, address, and full name.

    This sort of sets my "Spidey Sense" tingling: all that personal data in one table where the entire IT dev team can get full access.

    I was wondering about some better options?

    1. Limit access (for development I was thinking swap out real SSN with fakes on dev and UAT)

    2. Encryption options? Column level inside SQL 2008 R2? Other options? This is a 3rd party ERD so I can't change the values in the column itself without some sort of abstraction voodoo happening?

    3. Any other thoughts?

    Yes...

    First, thank you for the concern. I see too many people treat SSNs and other sensitive information in far too cavalier a manner.

    In regards to the 3rd party ERD software... If it were me, I'd give them the opportunity to make the necessary changes while looking for some other software that already gives SSNs the respect they deserve. If the original company fails to comply, report them to the Social Security Administration, the SEC, and the correct agencies for PCI and then drop them like a hot potato.

    As for having such information in Dev or Staging: either it simply must not be there, or it must be encrypted in such a fashion that no unauthorized person can "see" the original. The latter of those two is a tall order to comply with.

    --Jeff Moden



  • How about creating a separate table for SSN and DOB, with a foreign key on empid back to the employee table? Those two columns can then be encrypted. A separate table also lets you exclude it from replication, or drop it after restoring a copy, for security purposes.

    The other concern is access to this data: permission to read SSN and DOB should be properly evaluated.
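
    A minimal sketch of that layout, assuming hypothetical names throughout (dbo.Employee, dbo.EmployeeSensitive, and the HrSensitiveReader and DevTeam roles are illustrations, not objects from the vendor's schema). The sensitive columns hold ciphertext only, and read access is granted to one role and explicitly denied to another:

    ```sql
    -- Stub of the existing employee table (hypothetical), only so the FK has a target.
    CREATE TABLE dbo.Employee
    (
        EmpId    int          NOT NULL,
        FullName varchar(200) NOT NULL,
        CONSTRAINT PK_Employee PRIMARY KEY (EmpId)
    );

    -- Sensitive columns live in their own table, keyed by EmpId.
    CREATE TABLE dbo.EmployeeSensitive
    (
        EmpId        int            NOT NULL,
        SsnEncrypted varbinary(128) NOT NULL,  -- ciphertext, e.g. produced by ENCRYPTBYKEY
        DobEncrypted varbinary(128) NOT NULL,  -- ciphertext, e.g. produced by ENCRYPTBYKEY
        CONSTRAINT PK_EmployeeSensitive PRIMARY KEY (EmpId),
        CONSTRAINT FK_EmployeeSensitive_Employee
            FOREIGN KEY (EmpId) REFERENCES dbo.Employee (EmpId)
    );

    -- Hypothetical roles: one that may read the sensitive table, one that may not.
    CREATE ROLE HrSensitiveReader;
    CREATE ROLE DevTeam;

    GRANT SELECT ON dbo.EmployeeSensitive TO HrSensitiveReader;
    DENY  SELECT ON dbo.EmployeeSensitive TO DevTeam;  -- DENY overrides any inherited GRANT
    ```

    Because everything sensitive sits in one table, a restore script for dev can drop or truncate dbo.EmployeeSensitive without touching the rest of the HR data.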


  • Maxer (5/18/2014)


    A company I work with stores all HR data for all past and present employees in a table that has SSN, DOB, address, and full name.

    This sort of sets my "Spidey Sense" tingling: all that personal data in one table where the entire IT dev team can get full access.

    I was wondering about some better options?

    1. Limit access (for development I was thinking swap out real SSN with fakes on dev and UAT)

    2. Encryption options? Column level inside SQL 2008 R2? Other options? This is a 3rd party ERD so I can't change the values in the column itself without some sort of abstraction voodoo happening?

    3. Any other thoughts?

    Generally speaking, the entire IT department, including development, doesn't need access to production in any capacity, so I'm assuming we're just talking about how to protect sensitive data in the development and UAT environment.

    In the ETL process you use to refresh development and UAT, you can substitute the actual SSN, first name, and last name with a hash.

    print HASHBYTES('MD5','111223333')

    0x3A6838DE381E20C401DB7629508DB352

    print cast(HASHBYTES('MD5','111223333') as varchar(9))

    :h8Þ8 Ä
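
    One caveat with a plain hash: there are only a billion possible nine-digit SSNs, so an unsalted MD5 can be reversed by hashing every candidate. A minimal sketch of a salted variant for the refresh job follows, assuming a hypothetical dbo.Employee table with a char(9) SSN column and a secret salt that the ETL job supplies and that is never stored in dev or UAT. SHA1 is used because it is available in SQL Server 2008 R2; SHA2_256 needs SQL Server 2012 or later.

    ```sql
    -- Run against the DEV/UAT copy only, never against production.
    DECLARE @Salt varchar(64);
    SET @Salt = 'replace-with-a-long-random-secret';  -- assumption: passed in by the ETL job

    -- Replace each SSN with the first 9 hex characters of the salted hash.
    -- CONVERT style 2 renders varbinary as hex text without the 0x prefix.
    UPDATE dbo.Employee
    SET    SSN = LEFT(CONVERT(varchar(40), HASHBYTES('SHA1', @Salt + SSN), 2), 9);
    ```

    The result contains letters as well as digits, so an application that validates SSNs as numeric may reject it. Truncating to nine characters also makes collisions possible, which matters if the column has a unique constraint.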

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Eric M Russell (5/19/2014)



    Generally speaking, the entire IT department, including development, doesn't need access to production in any capacity, so I'm assuming we're just talking about how to protect sensitive data in the development and UAT environment.

    In the ETL process you use to refresh development and UAT, you can substitute the actual SSN, first name, and last name with a hash.

    print HASHBYTES('MD5','111223333')

    0x3A6838DE381E20C401DB7629508DB352

    print cast(HASHBYTES('MD5','111223333') as varchar(9))

    :h8Þ8 Ä

    This works but doesn't address the legitimate concern that this type of sensitive data should NOT be stored in clear text even in production. This type of thing should be encrypted or hashed. Hashing it only when it gets ported to a dev/test environment is only part of the solution.


  • Sean Lange (5/19/2014)



    This works but doesn't address the legitimate concern that this type of sensitive data should NOT be stored in clear text even in production. This type of thing should be encrypted or hashed. Hashing it only when it gets ported to a dev/test environment is only part of the solution.

    Agreed, the sensitive data needs to be properly encrypted/hashed in production. Hashing the data prior to sending it to dev/UAT is a must. There is no good reason to have valid SSNs in clear text in a dev environment. Even having SSNs in dev/UAT that are not test SSNs is getting a bit too close for comfort for me.
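
    One way to get test SSNs that cannot belong to a real person is to overwrite the column with values whose first three digits fall in the 900-999 range, which the Social Security Administration does not issue as SSNs. A minimal sketch, assuming a hypothetical dbo.Employee table with an EmpId column and a char(9) SSN column:

    ```sql
    -- Run against the DEV/UAT copy only, never against production.
    -- Each row gets a unique fake value: '9' followed by a zero-padded 8-digit sequence.
    WITH Numbered AS
    (
        SELECT SSN,
               ROW_NUMBER() OVER (ORDER BY EmpId) AS rn
        FROM   dbo.Employee
    )
    UPDATE Numbered
    SET    SSN = '9' + RIGHT('00000000' + CONVERT(varchar(8), rn), 8);
    ```

    The values stay numeric and nine characters long, so format validation in the application should still pass, and they stay unique as long as the table has fewer than 100 million rows.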

    Jason...AKA CirqueDeSQLeil

  • Sean Lange (5/19/2014)



    This works but doesn't address the legitimate concern that this type of sensitive data should NOT be stored in clear text even in production. This type of thing should be encrypted or hashed. Hashing it only when it gets ported to a dev/test environment is only part of the solution.

    Yes, in production it's good to have symmetric key encryption on sensitive columns, in addition to role-based security and least privilege.

    However, symmetric key encryption requires a varbinary column, an encryption key to be set up, and decrypt and re-cast functions in SELECT statements at runtime, so it involves a significant amount of re-architecture. What I've done in the past is create a view for each table that contains symmetric-encrypted columns, exposing them as decrypted and re-cast computed columns. That way the application or BI layer doesn't have to deal with that part; it just has to open the symmetric key and select from the view as if it were the base table.
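
    A minimal sketch of that view-over-ciphertext pattern, under stated assumptions: every object name (HrCert, HrSsnKey, dbo.EmployeeSecure, dbo.vEmployee), the password, and the sample SSN are hypothetical, and GO is the SSMS/sqlcmd batch separator rather than T-SQL itself.

    ```sql
    -- One-time setup: master key, certificate, and an AES-256 symmetric key.
    CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'Use-A-Long-Random-Passphrase-1!';
    CREATE CERTIFICATE HrCert WITH SUBJECT = 'HR column encryption';
    CREATE SYMMETRIC KEY HrSsnKey
        WITH ALGORITHM = AES_256
        ENCRYPTION BY CERTIFICATE HrCert;
    GO

    -- The base table stores only ciphertext.
    CREATE TABLE dbo.EmployeeSecure
    (
        EmpId        int            NOT NULL PRIMARY KEY,
        SsnEncrypted varbinary(128) NOT NULL
    );
    GO

    -- The view exposes the decrypted, re-cast value.
    -- DECRYPTBYKEY returns NULL when the symmetric key is not open in the session.
    CREATE VIEW dbo.vEmployee
    AS
    SELECT EmpId,
           CONVERT(char(9), DECRYPTBYKEY(SsnEncrypted)) AS SSN
    FROM   dbo.EmployeeSecure;
    GO

    -- Caller: open the key, use the view like the base table, then close the key.
    OPEN SYMMETRIC KEY HrSsnKey DECRYPTION BY CERTIFICATE HrCert;

    INSERT INTO dbo.EmployeeSecure (EmpId, SsnEncrypted)
    VALUES (1, ENCRYPTBYKEY(KEY_GUID('HrSsnKey'), CONVERT(varchar(9), '900000001')));

    SELECT EmpId, SSN FROM dbo.vEmployee;

    CLOSE SYMMETRIC KEY HrSsnKey;
    ```

    A user who can select from the view but has no permission on the certificate or key cannot open the key and so sees NULL in the SSN column, which is what keeps the decryption step under the control of key permissions rather than table permissions.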

    For the dev and QA environments it's a lot more straightforward: just give them the hashed or encrypted columns and don't let them decrypt them.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho
