
Comments to The Economics of Surveillance


4:54 pm September 29, 2012
Jassa Skott wrote:

Can you elaborate on how this works: “data that once seemed anonymous can actually identify people if it’s pooled with other data sets.”
================================================

3:03 am September 30, 2012
Jennifer Valentino-DeVries wrote:

@Jassa Skott Certainly. Happy to do so.

In short: The more data sets you pool, the more likely it is that you will have information among them that can be linked together.

Let’s say you have data sets from four apps. App A has a smartphone ID number and your ZIP+4 code. App B has smartphone ID and your age. App C stores your smartphone ID, your political preference and a bunch of political messages you are writing under a generic username. And finally App D has this same username and your location twice a day, when you check it at home and at work.

Not many people live in your ZIP+4 code and also share your age and political affiliation. Even fewer match all of that and also frequent your house and your workplace. Someone can purchase public records data that contains at least some of this information, provided you have done something like registering to vote. Once those records are matched against the pooled app data, your “anonymous” information is no longer very anonymous.
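
For readers who think in code, the linking described above can be sketched roughly as follows. This is only an illustration under stated assumptions: every record, name, field name and helper function here (`link_profiles`, `candidates`, `voter_roll` and so on) is made up for the example, and real data sets would be far larger and messier.

```python
# Hypothetical records from the four apps; none of this is real data.
app_a = [{"phone_id": "P1", "zip4": "10001-1234"},
         {"phone_id": "P2", "zip4": "10001-1234"}]
app_b = [{"phone_id": "P1", "age": 34},
         {"phone_id": "P2", "age": 61}]
app_c = [{"phone_id": "P1", "party": "Independent", "username": "user123"},
         {"phone_id": "P2", "party": "Democrat", "username": "bluejay"}]
app_d = [{"username": "user123", "places": ("12 Elm St", "500 Main St")},
         {"username": "bluejay", "places": ("9 Oak Ave", "77 Pine Rd")}]

# Hypothetical public records, such as a purchased voter file.
voter_roll = [
    {"name": "Alex Example", "zip4": "10001-1234", "age": 34, "party": "Independent"},
    {"name": "Sam Sample", "zip4": "10001-1234", "age": 61, "party": "Democrat"},
    {"name": "Pat Placeholder", "zip4": "10001-5678", "age": 34, "party": "Independent"},
]


def index_by(rows, key):
    """Build a lookup table from one field's value to its record."""
    return {row[key]: row for row in rows}


def link_profiles(a_rows, b_rows, c_rows, d_rows):
    """Join apps A, B and C on the phone ID, then app D on the username."""
    b_by_phone = index_by(b_rows, "phone_id")
    c_by_phone = index_by(c_rows, "phone_id")
    d_by_user = index_by(d_rows, "username")
    profiles = []
    for a in a_rows:
        phone = a["phone_id"]
        if phone not in b_by_phone or phone not in c_by_phone:
            continue  # this phone does not appear in every data set
        profile = {**a, **b_by_phone[phone], **c_by_phone[phone]}
        profile.update(d_by_user.get(profile["username"], {}))
        profiles.append(profile)
    return profiles


def candidates(profile, public_records):
    """Names in the public records that match ZIP+4, age and party."""
    keys = ("zip4", "age", "party")
    return [r["name"] for r in public_records
            if all(r[k] == profile[k] for k in keys)]


profiles = link_profiles(app_a, app_b, app_c, app_d)
for p in profiles:
    print(p["username"], "->", candidates(p, voter_roll))
```

In this toy data, no single app knows a name, yet each pooled profile matches exactly one person in the made-up voter file. The location pairs from App D are not even needed here; in a larger population they would serve to narrow the list of candidates further.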

This, of course, seems like a lot of work. And many companies that deal with this sort of data are in fact responsible and would not want to use it inappropriately. But I hope this shows that if a company or other entity buys a lot of data sets, it can piece together information about specific people that no single data set reveals on its own.


last updated january 2014