Hiding Data Isn't Always Easy

There is an article about a report in which data was redacted poorly. A consulting firm hired by Frontier Communications wrote a report and redacted lots of information. However, they apparently didn't do a good job, as all the information they blacked out could be read if the text were copied and pasted elsewhere. That is much easier with digital reports than analog ones.

This was a PDF document, and I checked it. On page 25, there is this sentence: "Annual capital expenditures for Frontier’s West Virginia local exchange carrier companies have averaged over XXXXXXXX for the past nine years." The X's are blacked out in the PDF, but pasting the sentence into another document shows this is a $70mm amount. It pays to know your tools.

This certainly isn't a good technique for hiding information, but it's not far off from what some people do when trying to mask or obfuscate sensitive production data for development environments. There are lots of cases where people use scripts that change data in one table but not the related data in other tables, or where the changes are incomplete and let sensitive data leak through.
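One way to avoid the "changed in one table, but not the related data" problem is to mask deterministically, so the same input always produces the same pseudonym wherever it appears. The Python below is only a minimal sketch of that idea, not a recommended production approach. The table layouts, the column names, the mask_value helper, and the key are all hypothetical, and I'm assuming a keyed hash (HMAC) is acceptable for the data in question.

```python
import hashlib
import hmac

# Hypothetical key for illustration only. A real key would be stored
# securely and never shipped to the development environment.
SECRET_KEY = b"example-key-not-for-real-use"


def mask_value(value: str, prefix: str = "user") -> str:
    """Return a stable pseudonym: the same input always gives the same output."""
    normalized = value.strip().lower()  # treat "Bob@..." and "bob@..." alike
    digest = hmac.new(SECRET_KEY, normalized.encode("utf-8"), hashlib.sha256)
    return f"{prefix}_{digest.hexdigest()[:12]}"


# Two hypothetical tables that both carry the same email addresses.
customers = [
    {"id": 1, "email": "ann@example.com"},
    {"id": 2, "email": "bob@example.com"},
]
orders = [
    {"order_id": 10, "contact_email": "ann@example.com"},
    {"order_id": 11, "contact_email": "Bob@example.com"},
]

# Apply the same function to every column that holds the sensitive value.
masked_customers = [{**row, "email": mask_value(row["email"])} for row in customers]
masked_orders = [
    {**row, "contact_email": mask_value(row["contact_email"])} for row in orders
]
```

Because both columns go through the same function with the same key, joins on the masked values still line up, and no table is left holding the original address. The hard part in practice is finding every column that carries the value, which this sketch does not attempt.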

To be fair, this is a hard problem, and there are no perfect solutions. Anyone masking data likely needs to take a few passes at the problem, making adjustments over time to ensure that the data is protected from unauthorized disclosure. Even then, researchers have repeatedly found ways to reconstruct the original data after it has been anonymized.
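Reconstruction often works because the columns left unmasked, taken together, single out one person. A rough way to look for that risk is to count how many rows share each combination of those columns, which is the idea behind k-anonymity. The sketch below assumes a hypothetical masked table where zip, birth_year, and gender were left alone. The function names and data are mine, for illustration only.

```python
from collections import Counter


def group_sizes(rows, quasi_identifiers):
    """Count how many rows share each combination of quasi-identifier values."""
    return Counter(tuple(row[col] for col in quasi_identifiers) for row in rows)


def smallest_group_size(rows, quasi_identifiers):
    """Return k, the size of the smallest group (0 for an empty table)."""
    sizes = group_sizes(rows, quasi_identifiers)
    return min(sizes.values()) if sizes else 0


def unique_rows(rows, quasi_identifiers):
    """Return the rows whose quasi-identifier combination appears only once."""
    sizes = group_sizes(rows, quasi_identifiers)
    return [
        row for row in rows
        if sizes[tuple(row[col] for col in quasi_identifiers)] == 1
    ]


# Hypothetical masked data: names are replaced, other columns were left as-is.
masked_people = [
    {"name": "user_a", "zip": "80014", "birth_year": 1975, "gender": "F"},
    {"name": "user_b", "zip": "80014", "birth_year": 1975, "gender": "F"},
    {"name": "user_c", "zip": "80014", "birth_year": 1982, "gender": "M"},
    {"name": "user_d", "zip": "80015", "birth_year": 1975, "gender": "F"},
]

columns = ["zip", "birth_year", "gender"]
k = smallest_group_size(masked_people, columns)
exposed = unique_rows(masked_people, columns)
```

Here k comes out as 1, and two of the four rows are unique on those three columns, so anyone holding an outside list with the same attributes could likely match them back to a person. A low k doesn't prove the data can be re-identified, and a high k doesn't prove it is safe, but it's a cheap signal that another masking pass is needed.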

This is an area that I think is still somewhat immature, with relatively few best practices available for anyone to look at. While I have seen some guidance, I don't see much on how one could verify that the masking actually worked. I hope we find ways to do better in the future, with more knowledge that helps data professionals check their own work. Otherwise, we won't be able to protect data the way we want to.
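As one small example of what verification could look like, the sketch below scans every column of a masked copy for values known to be sensitive in production. It's much like pasting the redacted PDF text into another document to see what is really there. The find_leaks function, the table names, and the sample rows are hypothetical, and a plain substring search like this only catches values that survived unchanged.

```python
def find_leaks(original_values, masked_tables):
    """Report every cell in the masked tables that still contains an original value.

    original_values: iterable of sensitive strings taken from production.
    masked_tables: dict mapping a table name to a list of row dicts.
    Returns a list of (table, row_index, column, value) tuples.
    """
    # Normalize once, skip empty strings, and sort for a stable report order.
    needles = sorted({value.lower() for value in original_values if value})
    leaks = []
    for table_name, rows in masked_tables.items():
        for index, row in enumerate(rows):
            for column, cell in row.items():
                text = str(cell).lower()
                for needle in needles:
                    if needle in text:
                        leaks.append((table_name, index, column, needle))
    return leaks


# Hypothetical masked copy: the email column was masked, the free-text
# notes column on the orders table was missed.
masked_tables = {
    "customers": [{"id": 1, "email": "user_3f2a9c1b7d10"}],
    "orders": [
        {"order_id": 10, "notes": "Customer asked us to reply to Ann@example.com"},
        {"order_id": 11, "notes": "Gift wrap requested"},
    ],
}

leaks = find_leaks(["ann@example.com", "bob@example.com"], masked_tables)
```

In this example the check reports one hit, in the notes column of the first orders row, which is the kind of forgotten free-text field that a column-by-column masking script tends to miss.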
