What Data Should You Protect?

  • I'm somewhat skeptical that birth date, zip code, and gender are enough to uniquely identify 87% of Americans. That may be true in more rural areas, but not in cities, where a zip code may contain 40,000 or more people. More data on the study is needed.

    Having said that, I do protect my data by using false birth dates, email addresses that go nowhere, etc. If they don't need the data to ensure it's really me, and that usually applies to a small fraction of online activity, they don't get it.

  • The short answer:

    In reality, it comes down to how many layers of security a company needs, and the perception of due diligence. That depends on the business model, and the perceived trust relationship with any players that can touch that model. Some strictly-business-related data may be far more sensitive to that business than personally identifiable data.

    Per Webster.com, diligence is "the attention and care legally expected or required of a person (as a party to a contract)".

    The Editorial:

    Security is a heart-warmer for those who desire to feel safe. However, using pseudonyms and such only serves to provide yet more dense information about a person. One cannot hide. That's life.

    The world, to the last, runs on trust. When trust is abused long enough it is lost, and the whole world suffers. Measures of security only truly succeed within trust relationships that can be positively identified, which is only possible in a face-to-face world. So how do we accomplish that when much of the digital world is transacted impersonally, in binary sequences that can be replicated at will?

    Every form of digital security currently in place on this earth is still based on privileged information that is negotiated on extremely loose trust principles, preying on the appearance of security and on the laziness of those who might otherwise be tempted to take and improperly use that which does not belong to them.

    True security implementations instead are more like signposts that guide most to proper behavior, much like a guard rail on a ship. It does not stop those who desire to leap over. One who blatantly desires to break that trust will, given enough time and effort, succeed... for a time. But then at that point the balance shifts, and the very act blazes markers of identification that trail and haunt the individual indefinitely, growing larger in detail with every misdeed.

    The definition of security has turned in perverse ways from being an act to help those who choose to do good and avoid blunder, to that of the nearly impossible feat of keeping those who choose to do ill from so doing. The original intent of security is also the origin of laws and ordinances, to guide one safely on agreed and common ground, leaving all to act as free as possible, as long as those acts do not take away those rights from others without cause.

    The blame too has been shifting, from those that do ill to those that fail to prevent or stop the ill-wanton.

    So what does this mean to database professionals? Does this mean we ignore security? NO! Of course not. Of necessity we must continue to attempt to plug holes, guard access, obfuscate information both in transit and at rest in files on supposedly guarded servers, and guide right by preaching information security to the rest of the brigade of process and application from cradle to grave (if a grave even exists for all the data that is collected...).

    But sadly, it is like an anchor on humanity that grows heavier with each passing year, keeping us from reaching new and greater horizons in the sea of possibility. Some may argue, "That's job security." For myself, I would rather not give up grand potential for a little immediate security. But the path chosen is a course that humanity plots as a whole. The stall-point is fairly easy to identify: when we spend more time patching holes and locking down hatches than we do in forward movement, or to put it another way, more time spent building and maintaining fences instead of roads, we've passed the stall point, and are taking on water.

    To put it into DBA terms, if more DBA work is spent creating and maintaining security, whether by managing user/group access, encryption, storage, and even change controls and any other form of so-called security-related actions, than on refining, tuning, upgrading, advancing reportability, designing better architecture, and feeding the business that life-blood we call data and so on, then the company is either dead or dying, and usually only a buyout saves it at that point.

    On an entirely different angle, I don't believe most organizations will ever curtail information gathering actions... having worked within more than one marketing firm, "More! More! More!" is both the battle cry and the sales pitch to clients: it's called targeted marketing. And the information available is overwhelming, beginning from even before the day of birth, to well after one is gone. We can talk about wishing personally identifiable information is kept out, but it will fall on ears that went deaf long ago... circa 1971.

  • I agree with your concept wholeheartedly as it applies to the business world. In our company, "clients" are not individuals but large companies, yet we still encrypt and 'hide' our client tables and use long customer IDs, a mix of letters and numbers, to 'identify' clients. Very few of our staff ever see the client detail information, let alone work with it.
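The post does not say how those customer IDs are produced, so the following is only a minimal sketch of one way to get long, meaningless letter-and-number IDs. The names `new_customer_id`, `ALPHABET`, and `restricted_client_lookup` are hypothetical, and the sketch does not cover the table encryption mentioned above.

```python
import secrets
import string

# Assumed alphabet: uppercase letters and digits, as in "a mix of letters and numbers".
ALPHABET = string.ascii_uppercase + string.digits


def new_customer_id(length: int = 16) -> str:
    """Return a random ID that carries no information about the client."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


# Hypothetical lookup that only a few staff could read: it is the single
# place where an opaque ID is tied back to a real client name.
restricted_client_lookup = {}
customer_id = new_customer_id()
restricted_client_lookup[customer_id] = "Example Client Ltd"

print(customer_id)
```

Because the ID is drawn at random rather than derived from the client's name or account details, seeing it in a report or a log reveals nothing unless the reader also has access to the lookup.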

    I think when you are talking about individual users and the web in general, people are basically way too trusting or, some might even say, stupid. If you consider social networking sites, I am still amazed by what people put out there in public. Whether it's data itself or even pictures, people simply do not think about the consequences of what they post.

    But then there is the flip side, as illustrated in this current lawsuit by the model in NY who is suing for slander based on what a formerly anonymous writer placed on a blog. The court ordered that the identity of the blogger be released. To me, that seems totally fair.

    So this is a door that swings both ways. Yes, personal information theft is a crime. But so is slander. Maybe what we need is a standardized way of creating a well-masked "handle" for our identities online, but also an extended freedom of information act when someone uses the web to slander and damage either an individual, or an organization.

    I don't actually see it happening, but it's a nice thought.

    There's no such thing as dumb questions, only poorly thought-out answers...
  • James Stover (8/31/2009)


    C'mon, where are the "Mark of the Beast" comments??

    Really, nature has already given us a GUID. It's called DNA. That bit of technology has been around for billions of years. It's just that we have only recently (by comparison) learned to crack the code. I wouldn't doubt that future technology will allow our DNA to function like an RFID tag. A bit of electronica will "ping" our DNA which will uniquely resonate allowing instant identification. "But DNA doesn't work like that". Sure. So anyway, I believe it's inevitable that somewhere in the not-so-distant future, DNA collection will be mandatory at birth and we will all end up in a database somewhere (it could already be happening). So, the 10-digit thingy is really just an interim step towards that inevitability.

    I'm thinking...Gattaca meets 1984 🙂

    DNA is an invalid PK. Identical siblings.
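A minimal sketch of the objection, using Python's built-in `sqlite3` as a stand-in for whatever database would really hold such data. The `person` table, the `dna_profile` column, and the sample values are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person (dna_profile TEXT PRIMARY KEY, name TEXT NOT NULL)"
)

# First twin inserts without trouble.
conn.execute("INSERT INTO person VALUES (?, ?)", ("GATTACA-0001", "Twin A"))

# The identical twin carries the same key value, so the primary key rejects it.
try:
    conn.execute("INSERT INTO person VALUES (?, ?)", ("GATTACA-0001", "Twin B"))
    duplicate_rejected = False
except sqlite3.IntegrityError as exc:
    duplicate_rejected = True
    print("Rejected:", exc)
```

Only one of the two siblings ends up in the table, which is exactly why a value shared by two people cannot serve as a key on its own.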

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • The point of the data about birthdate and Zip code isn't that knowing those allows someone to figure out who you are, it's that those data are adequate to operate as a PK for people in most circumstances.

    Except that it isn't immutable. People move.

    To the person who doubts this because of density: zip codes are limited in population size too. If an area becomes too dense, they build a new post office and give it a new zip code. Let's say 40,000 people is a decent upper limit. 40,000 people ranging in age from 0 to 79 gives (365*80) = 29,200 possible birth dates; add in gender and you have 58,400 possible keys for 40,000 people. So, sure, there's going to be some overlap in there, but I think that 87% is probably about right.
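A rough way to check the arithmetic above is to ask what fraction of people in one zip code would hold a (birth date, gender) key that nobody else shares. The sketch below assumes keys are spread uniformly at random, which real populations are not (ages and birth dates cluster), and it ignores leap days. `expected_unique_fraction` is a hypothetical helper, not something taken from the study.

```python
def expected_unique_fraction(people: int, keys: int) -> float:
    """Chance that one person's key is shared by none of the other people,
    assuming each person gets one of `keys` values uniformly at random."""
    return (1 - 1 / keys) ** (people - 1)


# Figures from the post: 80 birth years x 365 days x 2 genders.
keys = 365 * 80 * 2  # 58,400 possible (birth date, gender) keys

# Zip code populations to compare; 40,000 is the post's assumed upper limit.
for population in (8_000, 20_000, 40_000):
    fraction = expected_unique_fraction(population, keys)
    print(f"{population:>6} people: about {fraction:.0%} unique")
```

Under those assumptions, roughly half of the people in a 40,000-person zip code come out unique, while a figure near 87% corresponds to a zip code of about 8,000 people. So whether 87% holds nationally would depend heavily on how many people live in the smaller zip codes.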

    I recently read a similar article that indicated date of birth plus Home Zip code plus Work Zip code was even better.

    --

    JimFive

  • GSquared (9/1/2009)


    James Stover (8/31/2009)


    C'mon, where are the "Mark of the Beast" comments??

    Really, nature has already given us a GUID. It's called DNA. That bit of technology has been around for billions of years. It's just that we have only recently (by comparison) learned to crack the code. I wouldn't doubt that future technology will allow our DNA to function like an RFID tag. A bit of electronica will "ping" our DNA which will uniquely resonate allowing instant identification. "But DNA doesn't work like that". Sure. So anyway, I believe it's inevitable that somewhere in the not-so-distant future, DNA collection will be mandatory at birth and we will all end up in a database somewhere (it could already be happening). So, the 10-digit thingy is really just an interim step towards that inevitability.

    I'm thinking...Gattaca meets 1984 🙂

    DNA is an invalid PK. Identical siblings.

    Add a constraint. Mutate one of them. Problem solved...

    ---------------------------------------------------------
    How best to post your question
    How to post performance problems
    Tally Table: What it is and how it replaces a loop

    "stewsterl 80804 (10/16/2009)I guess when you stop and try to understand the solution provided you not only learn, but save yourself some headaches when you need to make any slight changes."

  • As for snitch,

    "Ahh, sweet relief in anonymity of common names.....":-D


  • What data should you protect? All of it! It can be done with various levels of protection, but it all should be protected. Give a hacker an appetizer and they will surely be back for more. Make it a little difficult and they may move on to easier targets.

    That said... a hacker with a mission does not know the meaning of moving on. In fact, the harder it is to get some data, the greater the challenge that needs to be conquered.

    Life in the world of data.....

    Joe 😎

  • jcrawf02 (9/1/2009)


    As for snitch,

    "Ahh, sweet relief in anonymity of common names.....":-D

    True enough. I'd hate to be the person whose career depended on an accurate background check on "John Smith". On the other hand, "Steve Jones" isn't exactly an uncommon name, but snitch found data on him pretty readily. Still takes a human to sort through it for common threads, but algorithms are getting better all the time.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC

  • jcrawf02 (9/1/2009)


    GSquared (9/1/2009)


    James Stover (8/31/2009)


    C'mon, where are the "Mark of the Beast" comments??

    Really, nature has already given us a GUID. It's called DNA. That bit of technology has been around for billions of years. It's just that we have only recently (by comparison) learned to crack the code. I wouldn't doubt that future technology will allow our DNA to function like an RFID tag. A bit of electronica will "ping" our DNA which will uniquely resonate allowing instant identification. "But DNA doesn't work like that". Sure. So anyway, I believe it's inevitable that somewhere in the not-so-distant future, DNA collection will be mandatory at birth and we will all end up in a database somewhere (it could already be happening). So, the 10-digit thingy is really just an interim step towards that inevitability.

    I'm thinking...Gattaca meets 1984 🙂

    DNA is an invalid PK. Identical siblings.

    Add a constraint. Mutate one of them. Problem solved...

    I can just see the ACLU's take on the idea of adding genetic markers to twins for identification purposes.

    Might be worth it just to tick some people off!

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC

  • DNA is an invalid PK. Identical siblings.

    What's this...duplicate PKs? That never happens! Well, you know how to solve that one. Choose one and discard the others 🙂

    Of course, we have to account for clones, too. Can't forget the clones (also inevitable, IMO). Does my clone get the same 10-digit ID or a new one? I would have gotten away with it if it hadn't been for those meddling clones!


    James Stover, McDBA
