The Uninteresting But Necessary Work

There has been a rising tide of data legislation in the last few years that asks organizations, especially private companies, to better protect their data. The GDPR is one of the best known, with enforcement beginning earlier this year, and I've been doing quite a bit of work in relation to this law. There are plenty of other laws, such as California's CCPA, Australia's NDB, Japan's APPI, and more, that we ought to be aware of as data professionals. These laws govern personal data in a variety of ways, and they can affect how we process and use portions of the data we store.

Changing our data handling practices isn't as simple as it might sound. In fact, many of us can't easily do this now unless we've done some work in advance: we need to have classified our data. We need to understand the impact of the various columns in our tables, the exports of flat files or reports, and even the development processes that make copies of our production databases.

I'll be honest: classification work is mind-numbingly boring. It almost feels like busy work to me, especially once we get past the obvious tax IDs and birthdays of people. When we start examining other data, the task feels like it ought to be delegated to junior staff, but many of them lack the experience to make the decisions. What can be more frustrating is that most of them lack the status to get others in the organization to respond to questions, which means the task ultimately falls on more senior people. Those senior people often do it once and then forget it, leading to out-of-date information.

We don't like doing classification, but we need to do it. Without some mechanism that lets us determine whether data can be moved or used in another system, database, report, etc., we end up ping-ponging between extremes. We assume all data is sensitive and try to lock it all down. That leads to complaints, as well as staff working to circumvent the rules until the rules appear meaningless. At this point we might give up on controlling data and just trust people. That leads to audit problems, potential data loss from security incidents, and plenty of embarrassment about why we didn't implement some simple controls.

Then the cycle starts again.

Classifying data is simple in some ways, but it's not easy to ensure that the classifications are available, up to date, and easy to find for a team of any size. I've seen simple solutions that rely on spreadsheets. I've seen complex software packages that are expensive and cumbersome to integrate with other applications. Microsoft has started to help with a few changes in SSMS, but this doesn't seem like a long-term solution, though SQL Server 2019 might help. Redgate has spent some time thinking about this issue as well, and we have an early access program now. All of these are partial solutions that might work for some organizations, but not all.

Ultimately, classification is like security: something we ought to be building into our systems from day one. Every proof-of-concept or prototype ought to be classifying data from the beginning. We won't be perfect, and we won't get every label correct, but if we're always thinking about the data, we can always correct a label and decide to handle the data more tightly or more loosely. I'd also like to think that if we label the data conservatively early on, we're less likely to mishandle it in a way that leads to accidental data loss.
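As a rough illustration of what labeling conservatively from the start could look like, here is a minimal sketch in Python. Everything in it is an assumption made for the example: the four label names, the environment names and their ceilings, and the `Catalog` class are hypothetical, not taken from any of the products mentioned above. The one idea it tries to show is that an unlabeled column defaults to the most restrictive label until someone reviews it.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Sensitivity(IntEnum):
    """Hypothetical labels, ordered from least to most sensitive."""
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


# Assumption: the highest label each environment is allowed to hold.
ENVIRONMENT_CEILING = {
    "production": Sensitivity.RESTRICTED,
    "reporting": Sensitivity.CONFIDENTIAL,
    "development": Sensitivity.INTERNAL,
}


@dataclass
class Catalog:
    """Maps (table, column) to a label; unlabeled columns get the default."""
    default: Sensitivity = Sensitivity.RESTRICTED  # conservative until reviewed
    labels: dict = field(default_factory=dict)

    def classify(self, table, column, label):
        # Record or correct the label for one column.
        self.labels[(table, column)] = label

    def label_for(self, table, column):
        # Fall back to the conservative default when no label exists.
        return self.labels.get((table, column), self.default)

    def can_copy(self, table, column, environment):
        # A column may be copied only if its label fits under the ceiling.
        return self.label_for(table, column) <= ENVIRONMENT_CEILING[environment]


catalog = Catalog()
catalog.classify("Customer", "TaxId", Sensitivity.RESTRICTED)
catalog.classify("Customer", "Country", Sensitivity.INTERNAL)

print(catalog.can_copy("Customer", "Country", "development"))  # True
print(catalog.can_copy("Customer", "TaxId", "development"))    # False
print(catalog.can_copy("Customer", "Notes", "development"))    # False: unlabeled
```

Correcting a label later is a single `classify` call, so starting strict and loosening after review costs little.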
