• Sergiy (4/28/2016)


    http://opendata.stackexchange.com/questions/5595/how-should-us-ssn-be-anonimized

    Since SSN has only 9 digits, changing hash function will not suffice because attacker can simply apply the function to all 10^9 SSN's and match the result against the database.

    Any hacker would have a mapping table between all possible SSN's and corresponding hashes. If they don't at the moment - it's not so long to generate one using the method of your choice.

    So, you better use internally generated ID's, and for mapping use a table stored in another database with restricted access.

    May be even on another dedicated instance with specially designed security model.

    The following will build a dictionary table containing SSN and SHA2_512 hash for every possible social security number (~ 1 billion rows). I hacked this together in 1/2 hour so forgive any T-SQL coding practices that offend your sensibilities. However, it does works.

    Allow 200 GB for table, index, and temp space, and let it run overnight.

    Do NOT run this on production server!!!

    create table HashSSN

    (

    SSN char(11) not null

    constraint PK_HashSSN primary key

    with (data_compression = page, ignore_dup_key = on),

    SSN_SHA2_512 binary(64) not null

    );

    create unique index ix_SSN_SHA2_512

    on HashSSN ( SSN_SHA2_512 )

    with (data_compression = page);

    GO

    set nocount on;

    declare @i int = isnull((

    select cast(replace(max(SSN), '-', '') as int)

    from HashSSN

    ), 0);

    while @i < 999999999

    begin;

    -- print status message for each loop:

    raiserror ('@i = %i', 0, 1, @i) with nowait;

    insert into HashSSN (SSN, SSN_SHA2_512)

    select stuff(stuff(cast(SSN as char(11)), 4, 0, '-'), 7, 0, '-') as SSN, hashbytes('SHA2_512', SSN) as SSN_SHA2_512

    from (

    select right('000000000' + cast(N as varchar(9)), 9) as SSN

    from (

    select top 250000 @i + row_number() over (

    order by error, msglangid

    ) as n

    from master.sys.sysmessages

    ) T1

    where N <= 999999999

    ) as T2

    order by SSN;

    -- set database to simple recovery and manually checkpoint log here.

    checkpoint;

    select @i = @i + 250000;

    end;

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho