Differential Privacy

There is an introduction to differential privacy that you can read, though the math gets a bit complex. Essentially, the idea is to share data anonymously while preventing the kind of re-identification problem Netflix ran into after it released its Netflix Prize dataset in 2006.
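As a rough illustration of the "add noise" idea, here is a minimal sketch of the Laplace mechanism applied to a count query. It is not taken from the introduction mentioned above; the function names, the sample rows, and the `epsilon` value are all assumptions chosen for the example.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two independent exponential draws with mean `scale`
    # follows a Laplace distribution centered on 0 with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(rows, predicate, epsilon, rng):
    # A count has sensitivity 1: adding or removing one person changes
    # the true answer by at most 1, so the noise scale is 1 / epsilon.
    true_count = sum(1 for row in rows if predicate(row))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical data: ages of people in a table.
ages = [23, 35, 41, 52, 29, 67, 38, 44]
rng = random.Random(2024)
answer = noisy_count(ages, lambda age: age >= 40, epsilon=0.5, rng=rng)
print(f"noisy count of people aged 40+: {answer:.2f}")
```

A smaller `epsilon` means more noise and a stronger privacy guarantee per query; a larger one gives more accurate answers with a weaker guarantee.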

It's an interesting idea, but I'm not sure it works well, even if you limit the queries from a client that is doing machine learning or open research. The big problem is that when you allow open access, limiting each individual user doesn't tell you whether a different user is collecting the rest of the data. After all, even if you allowed only students to access your database, can you be sure that they aren't collaborating and sharing the results of their queries?
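The collaboration worry can be made concrete with a small simulation. This is only a sketch under simple assumptions: every query is answered with fresh Laplace noise, the limit is enforced per user rather than across all users, and the names and numbers are hypothetical.

```python
import random
import statistics


def laplace_noise(scale, rng):
    # Difference of two exponential draws with mean `scale` is Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def pooled_estimate(true_value, epsilon, users, queries_per_user, rng):
    # Each user stays within a personal query limit, but the group
    # shares every noisy answer and averages them together.
    answers = [
        true_value + laplace_noise(1.0 / epsilon, rng)
        for _ in range(users * queries_per_user)
    ]
    return statistics.fmean(answers)


rng = random.Random(7)
secret = 42  # the true value the noise is supposed to hide
alone = pooled_estimate(secret, epsilon=0.5, users=1, queries_per_user=1, rng=rng)
group = pooled_estimate(secret, epsilon=0.5, users=100, queries_per_user=10, rng=rng)
print(f"one user, one query: {alone:.2f}")
print(f"100 users pooling 10 queries each: {group:.2f}")
```

Averaging many independent noisy answers cancels most of the noise, which is why a limit that is tracked per user, rather than as one budget shared across everyone who queries the data, offers little protection against users who collaborate.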

The vast amount of data available on many people makes it hard to ensure privacy, especially when third-party data sets can be combined with your own. This is one reason that I often use random birthdays on different sites: I worry that my data is going to be leaked and correlated with data from other breaches. It's not much, but it is something.

There are companies that seem to think that differential privacy and a little noise in datasets help ensure privacy. I think they're naive, or perhaps disingenuous, but either way, privacy remains a serious problem in development work of all sorts, especially when we find that the controls around development data are much poorer than those around production data. Even production data (arguably) isn't well secured.

There aren't great solutions, but I do think that most companies should have some sort of curated data set for development purposes: fake data that allows developers to build software, but isn't likely to cause an issue for a human if the data is exposed. Unfortunately, I'm not hopeful that any significant number of companies will actually go down this path.
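A curated set of fake data doesn't have to be elaborate. The sketch below generates repeatable, made-up customer rows using only the Python standard library; the column names, name lists, and date range are hypothetical and would need to match your own schema.

```python
import random
from datetime import date, timedelta

# Hypothetical name pools; none of these come from real customer data.
FIRST_NAMES = ["Alex", "Blake", "Casey", "Drew", "Emery", "Finley"]
LAST_NAMES = ["Archer", "Brooks", "Carver", "Dalton", "Ellis", "Foster"]


def fake_customer(customer_id, rng):
    first = rng.choice(FIRST_NAMES)
    last = rng.choice(LAST_NAMES)
    # Random birth date within roughly 55 years of 1950-01-01.
    birth = date(1950, 1, 1) + timedelta(days=rng.randrange(20000))
    return {
        "customer_id": customer_id,
        "name": f"{first} {last}",
        # example.com is reserved for documentation, so no real inbox is hit.
        "email": f"{first}.{last}{customer_id}@example.com".lower(),
        "birth_date": birth.isoformat(),
    }


def fake_customers(count, seed=0):
    # A fixed seed gives every developer the same known data set.
    rng = random.Random(seed)
    return [fake_customer(i, rng) for i in range(1, count + 1)]


for row in fake_customers(3):
    print(row)
```

Because the rows are generated from a seed rather than copied from production, losing a development database built this way exposes nothing about a real person.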
