Classifying Sensitive Data

  • Comments posted to this topic are about the item Classifying Sensitive Data

  • "they're" not "there". Sign of aging.

  • 1. Identify Sensitive vs. Really Sensitive vs. Burn Before Reading

    Sounds like common sense, but IT isn't the department who should classify what data can be seen and what can't and by whom. That job belongs to the stakeholders. Sometimes it's defined by statute and that makes it at least doable if IT is stuck with it, but I always stress to the C-level folks "unless you tell me otherwise, I'll assume this data can be posted on a billboard in Times Square for all the world to see."

    The sputters of indignation are awesome. :hehe: Followed by a fairly productive meeting where the C-level folks actually take an interest.

    2. Do we really need it?

    The second question I always ask is "Why do we have to store this sooper-sekret data anyway? MUST we store the employee's social security number in their record? Why? We don't do payroll in house, so why should we have it? Or credit card numbers? Or their mother's maiden name?" (Insert your particular sooper-sekret here.)

    I think in our quest to squirrel away every bit of data like it's Fimbulwinter we forget there are things we probably SHOULD NOT STORE. Of course, convincing stakeholders of that can be exhausting...

    3. Isolate it.

    I think Steve's instinctive first reaction of "put it all in separate tables" is both common-sensical and pragmatic. It's easier to protect, it's kept out of sight most of the time, and we don't need to worry about encrypting single columns and the headache of extra stored procedures (you do access tables only with SPs, right? :)).

    4. But we NEED it...

    Sigh. The real elephant in the room is when there's some kind of really sensitive data that needs to be used across a wide swath of the organization. I can't imagine a need to keep it as a column in the main table though. It's not that much overhead to isolate it in its own table (perhaps along with other sensitive data). The real headache is deciding who can see it, and making sure the SPs that can aren't "borrowed" for some non-sensitive report or purpose for convenience's sake. That way lies data breach!
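
A minimal sketch of points 3 and 4, assuming a hypothetical employee schema and using Python's built-in sqlite3 module. SQLite has no stored procedures, so plain functions stand in for the SPs here; the table names, column names, and the "payroll" role are all made up for illustration.

```python
import sqlite3

# Hypothetical schema: the sensitive column lives in its own table,
# keyed by the same id as the main table.
SCHEMA = """
CREATE TABLE employee (
    employee_id INTEGER PRIMARY KEY,
    full_name   TEXT NOT NULL
);
CREATE TABLE employee_sensitive (
    employee_id INTEGER PRIMARY KEY REFERENCES employee(employee_id),
    ssn         TEXT NOT NULL
);
"""

# Roles allowed to read the sensitive table (an assumption for this sketch;
# in practice the stakeholders decide this list).
ALLOWED_ROLES = {"payroll"}


def build_demo_db():
    """Create an in-memory database with one made-up employee."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO employee VALUES (1, 'Pat Example')")
    conn.execute("INSERT INTO employee_sensitive VALUES (1, '000-00-0000')")
    return conn


def get_employee(conn, employee_id):
    """General-purpose read: never touches the sensitive table."""
    return conn.execute(
        "SELECT employee_id, full_name FROM employee WHERE employee_id = ?",
        (employee_id,),
    ).fetchone()


def get_employee_ssn(conn, employee_id, caller_role):
    """The only path to the sensitive table; refuses other roles."""
    if caller_role not in ALLOWED_ROLES:
        raise PermissionError(f"role {caller_role!r} may not read SSNs")
    row = conn.execute(
        "SELECT ssn FROM employee_sensitive WHERE employee_id = ?",
        (employee_id,),
    ).fetchone()
    return row[0] if row else None
```

The point of keeping get_employee_ssn separate is that a convenience report built on get_employee cannot pick up the sensitive column by accident.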

  • Is data classification something that you work with often?
    How do you decide the data classes and does this impact your administration of the database?


    When it comes to defining which data elements should be classified, and the use cases for who should access the data, when and where, the business (not the DBA team) is responsible for gathering and documenting the requirements. The role of the DBA is coordinating with Data Governance and Compliance on the framework and best practices and then implementing it. Also, the DBA team should be responsible for, or at least collaborate on, an enterprise-wide data dictionary, because you can't protect something if you don't know everywhere it exists.
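
As a rough illustration of the data dictionary idea, here is a minimal sketch, assuming a hypothetical four-level classification scheme and made-up database, table, and column names. It only shows how a dictionary lets you ask "where does data at this level or higher live?"

```python
from dataclasses import dataclass
from enum import IntEnum


class Classification(IntEnum):
    """Hypothetical levels; the business defines the real ones."""
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


@dataclass(frozen=True)
class DictionaryEntry:
    """One column in the enterprise data dictionary."""
    database: str
    table: str
    column: str
    classification: Classification


def locations_at_or_above(entries, level):
    """Return sorted 'database.table.column' names classified at level or higher."""
    return sorted(
        f"{e.database}.{e.table}.{e.column}"
        for e in entries
        if e.classification >= level
    )


# Made-up entries for illustration only.
SAMPLE_ENTRIES = [
    DictionaryEntry("hr", "employee", "full_name", Classification.INTERNAL),
    DictionaryEntry("hr", "employee", "ssn", Classification.RESTRICTED),
    DictionaryEntry("sales", "customer", "card_number", Classification.RESTRICTED),
    DictionaryEntry("sales", "product", "list_price", Classification.PUBLIC),
]
```

Calling locations_at_or_above(SAMPLE_ENTRIES, Classification.RESTRICTED) lists every place the most sensitive data is stored, which is the inventory you need before you can protect it.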

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • In my current state agency job we deal with a lot of HIPAA data. Kudos to the state department I work in for having identified that data and taking appropriate actions to identify, classify and secure that data.

    In my previous job it was much harder. We collected data from people with substance abuse addictions. Was that HIPAA or not? We were funded through the City and they were undecided through most of the time the organization existed. Sometimes they identified the data as being HIPAA compliant, other times they didn't. And that was the City's lawyers making those judgement calls. I appreciate your article Steve, this is not a simple thing to identify data as being one thing or another.

    Kindest Regards, Rod Connect with me on LinkedIn.

  • There was a time (not that long ago) when almost everyone's name, address and phone number were published in the phone book. It was not considered a big deal, in fact it was a convenience.

    But that was before Big Data. Things that were trivial before became just pieces in a puzzle that could be assembled by data analysis. Information could be conjoined with information from phone companies, cable companies, supermarket discount cards, Google searches etc. to build a detailed picture of people's lives. By the 2012 presidential election (according to an article in Technology Review) the DNC believed they had identified (via data analysis) virtually everyone who had voted for Obama in 2008. Information was also plotted from cable company records (which were theoretically anonymized--haha) to predict political leanings and sensitive issues based on TV and movie viewing habits. Phone messaging was tied to these associations; you wouldn't necessarily get the same script as your neighbor.

    By 2016, that's all old hat. The major players know more about you than your mother does.

    This poison runs deep, and sadly, we DB folks are part of the Faustian problem.

    ...

    -- FORTRAN manual for Xerox Computers --

  • erb2000 - Tuesday, June 27, 2017 4:36 AM

    "they're" not "there". Sign of aging.

    Thank you for pointing out the typo. Politeness might dictate that one message an author of a minor issue, but thanks for noting this publicly and reminding me of my age.

  • roger.plowman - Tuesday, June 27, 2017 6:38 AM

    1. Identify Sensitive vs. Really Sensitive vs. Burn Before Reading
    Sounds like common sense, but IT isn't the department who should classify what data can be seen and what can't and by whom. That job belongs to the stakeholders.
    2. Do we really need it?
    ...

    Re: 1, absolutely, but we in IT must handle the actual details of marking data as classified and dealing with it on a pragmatic basis (masking, encrypting, moving, isolating, etc.). Once someone makes decisions, we have to manage this data somehow, which implies some process. My thoughts are that we tend to do this very, very rarely.

    2. Often not. I am a fan of ensuring we capture all the data we need, even if we're unsure, but I also think that needs to be balanced in the privacy sense of not capturing secure information unless there is a need.

  • jay-h - Tuesday, June 27, 2017 9:06 AM

    There was a time (not that long ago) when almost everyone's name, address and phone number were published in the phone book. It was not considered a big deal, in fact it was a convenience.

    But that was before Big Data. Things that were trivial before became just pieces in a puzzle that could be assembled by data analysis.
    ...
    This poison runs deep, and sadly, we DB folks are part of the Faustian problem.

    I tend to agree. Not to get into a political debate, but your first few sentences do give me pause. What we've always had as a public record, because it seemed innocuous, is somehow viewed differently now that someone physically remote can assemble vast troves of data and glean information in a way that wasn't very practical before.

  • jay-h - Tuesday, June 27, 2017 9:06 AM

    ....
    Information could be conjoined with information from phone companies, cable companies, supermarket discount cards, Google searches etc. to build a detailed picture of people's lives. By the 2012 presidential election (according to an article in Technology Review) the DNC believed they had identified (via data analysis) virtually everyone who had voted for Obama in 2008.
    ...

    I don't know how much time and effort they put into all that, but they must have been chagrined to learn that actual voting records were available for purchase or leaked online. Based on the results of the 2016 (and 2017) elections, I'd say that hackers know more about the American populace than the DNC's big data consultants. 😎  :crying:

    "... First and last names. Recent addresses and phone numbers. Party affiliation. Voting history and demographics. A database of this information from 191 million voter records was posted online over the last week, the latest example of voter data becoming freely available ..."
    https://www.nytimes.com/2015/12/31/us/politics/voting-records-released-privacy-concerns.html

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Some internet service providers allow other customers to piggyback on your Wi-Fi router for use as a hotspot, which makes profiling individuals based on their personal browsing history problematic. There have even been complaints from folks receiving Copyright Alert warnings that were triggered by their neighbors' web browsing and downloads.

    Your home Comcast Wi-Fi router is likely also part of a giant “public” hotspot network. Here’s how to turn that off.
    https://www.fastcompany.com/3039682/comcast-was-sued-for-quietly-making-your-homes-internet-part-of-the-sharing-economy

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Eric M Russell - Tuesday, June 27, 2017 11:14 AM

    Some internet service providers allow other customers to piggyback on your Wi-Fi router for use as a hotspot, which makes profiling individuals based on their personal browsing history problematic. There have even been complaints from folks receiving Copyright Alert warnings that were triggered by their neighbors' web browsing and downloads.

    Your home Comcast Wi-Fi router is likely also part of a giant “public” hotspot network. Here’s how to turn that off.
    https://www.fastcompany.com/3039682/comcast-was-sued-for-quietly-making-your-homes-internet-part-of-the-sharing-economy

    That's one reason why I buy my own router and configure it myself.

    Kindest Regards, Rod Connect with me on LinkedIn.

  • Profiling individuals based on big data and analytics has potential, but it is limited. Obviously, you can use Google Maps or cell tower telemetry to predict that someone will probably be at the gym on Tuesday and Thursday evenings. You can also make some pretty good predictions based on their spending habits.

    However, you can't reliably predict a specific individual's choice of vote, religious beliefs, sexual orientation, or criminal activity based on peripheral things like web searches or TV viewing. Two people viewing the same TV show or website can be thinking entirely different things. For example, I actually spend more time browsing news websites with a political orientation that is opposite to my own, because I find it interesting to hear opposing points of view.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Eric M Russell - Tuesday, June 27, 2017 11:14 AM

    ... There have even been complaints from folks receiving Copyright Alert warnings that were triggered by their neighbors' web browsing and downloads.

    ...

    According to Comcast, the public facing and home facing signals are completely separate (which certainly would make sense from both technical and legal perspectives). I tend to be skeptical of those claims.

    ...

    -- FORTRAN manual for Xerox Computers --

  • Steve Jones - SSC Editor - Tuesday, June 27, 2017 9:58 AM

    jay-h - Tuesday, June 27, 2017 9:06 AM

    There was a time (not that long ago) when almost everyone's name, address and phone number were published in the phone book. It was not considered a big deal, in fact it was a convenience.

    But that was before Big Data. Things that were trivial before became just pieces in a puzzle that could be assembled by data analysis.
    ...
    This poison runs deep, and sadly, we DB folks are part of the Faustian problem.

    I tend to agree. Not to get into a political debate, but your first few sentences do give me pause. What we've always had as a public record, because it seemed innocuous, is somehow viewed differently now that someone physically remote can assemble vast troves of data and glean information in a way that wasn't very practical before.

    Yes. It really comes down to how much you can tie in (and how easy it is to get hold of and use vast lists of that data).

    As for us: given the number of regs we are currently being subjected to, we've restarted a previously shelved initiative to create a "sensitive data vault": identifying areas where various classifications of data exist, essentially excising the extra sensitive stuff and storing it with additional encryption elsewhere, and changing the data access patterns to keep the confidential data out of the other data flows. The non-confidential data entity will then be handed a GUID representing the placeholder for the confidential data. Anyone accessing it will have to be granted explicit access to the private keys used to decrypt it.

    After that - the real fun begins: breaking apart the data access code to split off the sensitive access from the "public" (some of the regs we deal with will fail us if we commingle the two).

    SQL Server's Always Encrypted may work for some of our "easier" confidential data patterns, but others involve complex data entities/BLOBs where Always Encrypted won't work.
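
The GUID placeholder pattern described above could be sketched roughly as follows. This is a toy under stated assumptions, not the actual design: the SensitiveVault class, split_record helper, field names, and principal names are all hypothetical, the store is an in-memory dict, and encryption is left out entirely (a real vault would encrypt values at rest and gate access on decryption keys rather than on a set of names).

```python
import uuid


class SensitiveVault:
    """Toy in-memory stand-in for a separate, access-controlled store.

    Encryption is deliberately omitted from this sketch.
    """

    def __init__(self, authorized):
        self._authorized = set(authorized)  # principals allowed to read
        self._store = {}                    # token -> sensitive value

    def put(self, value):
        """Store a value and return the GUID placeholder for it."""
        token = str(uuid.uuid4())
        self._store[token] = value
        return token

    def get(self, token, principal):
        """Return the stored value, but only for authorized principals."""
        if principal not in self._authorized:
            raise PermissionError(f"{principal!r} may not read the vault")
        return self._store[token]


def split_record(record, sensitive_fields, vault):
    """Return a copy of record with sensitive values replaced by vault tokens."""
    public = {}
    for key, value in record.items():
        public[key] = vault.put(value) if key in sensitive_fields else value
    return public
```

The non-confidential record can then flow through the ordinary data paths carrying only the placeholder, while anything that needs the real value has to go through the vault's get call.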

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
