I'm working on a project where the source data (out of my control) uses social security numbers to uniquely identify people. I don't even want to import SSN's into my database (in my control), let alone include them on reports (as is currently happening). But I need some way to be able to correlate an SSN in the source data with a [whatever value I use] in my own data.
So, I could maybe do something like hash the SSN's during the ETL step and store the encoded values in my database. But then I wind up with a huge 64-character or 128-character string. That's fine behind the scenes, but I also need some reasonable way to display that value on reports. It would be nice if I could have some kind of YouTube-like format, 10 digits or so of human-readable characters.
Any thoughts on how to do a one-way hash of a SSN (fixed 9 digit length) that results in a short-ish value without totally compromising security?
Or is the only option to store the full 64/128-character hashed value and then map it to some other YouTube-like unique identifier that I'd have to create for each record using perhaps a homemade function or something? I'd prefer not to use an auto-incrementing number, because I know that common wisdom is not to display those to end-users ("why do I see #100 and #102 but not #101???"). SQL Server doesn't seem to have anything other than GUID (also too long) built in, so my function would have to create a random string of letters and numbers and make sure it has not already been used in the table. Seems annoying and kinda hoopty. 🙁
"If I had been drinking out of that toilet, I might have been killed."