Why We're Not Calling the Cambridge Analytica Story a 'Data Breach'

On Saturday, an investigation by The New York Times, the Guardian, and its sister publication The Observer revealed that Cambridge Analytica, the data analytics firm that helped the Donald Trump presidential campaign, had harvested the Facebook data of more than 50 million people in an effort to profile users and eventually target them with political ads.

In 2014, a researcher collected the data through an app that asked users to take a personality test for academic research purposes. Around 270,000 people agreed to have their data collected through the test, which its creator, Aleksandr Kogan, described as “a very standard vanilla Facebook app.” But thanks to Facebook’s terms of service and its API at the time, the app was also able to collect their friends’ data. This gave the researcher, who later handed the data to Cambridge Analytica, the raw information of more than 50 million people, according to the reports, which were largely based on the account of a former Cambridge Analytica data scientist.

The Observer called it one of Facebook’s “biggest ever data breaches.” The Times only referred to the incident as a “breach” once, using the term “leak” throughout the rest of the article. We at Motherboard believe the use of the expression “data breach” in this case is incorrect and may be confusing to readers.

As the news spread and echoed online, several websites and other publications called it a data breach. Many security experts and researchers—and Facebook itself—believe this is the wrong expression to refer to what happened here.

“It is incorrect to call this a ‘breach’ under any reasonable definition of the term,” Facebook’s chief security officer Alex Stamos wrote in a deleted tweet.

Facebook’s vice president and deputy general counsel Paul Grewal wrote that “the claim that this is a data breach is completely false,” because the researcher who made the app obtained the data from “users who chose to sign up to his app, and everyone involved gave their consent.”

Saying that “everyone involved” consented seems misleading, given that only around 270,000 out of the 50 million people who got their data harvested reportedly signed up for the app. The others probably had no idea this app even existed. And since Facebook changes its privacy settings so frequently, we also don’t know if the people who agreed to use the app fully understood what kind of data they were giving up. And no one at the time knew the data would later be handed out to a shadowy data analytics firm hired by the Trump campaign.

While we understand why some are describing Kogan’s handover of the data to Cambridge Analytica as a breach, we believe, based on what’s been reported so far, that using that term would, at least at the moment, mislead our readers.

We’ve been regularly covering data breaches for years. No one hacked into Facebook’s servers exploiting a bug, like hackers did when they stole the personal data of more than 140 million people from Equifax. No one tricked Facebook users into giving away their passwords and then stole their data, like Russian hackers did when they broke into the email accounts of John Podesta and others through phishing emails.

In 2014, when Kogan collected the data of 50 million people, he was playing by the rules. At the time, Facebook allowed third party apps to collect not only the data of the people who consented to giving it up, but also their friends’ data. The company later shut down this functionality.

Facebook says the data was misused because Kogan told Facebook he would use it only for academic research. But that might be the only anomalous thing about this case.

Facebook obviously doesn’t want the public to think it suffered a massive security breach, like Yahoo did in 2013 and 2014. We agree that this was not a breach, not because we want to minimize the significance of the Cambridge Analytica story, but because the real story is far more troubling: This data collection was par for the course. In other words, it was a feature, not a bug. And while the process that Kogan exploited is no longer allowed, Facebook still collects—and then sells—massive amounts of data on its users.

As Zeynep Tufekci, the author of Twitter and Tear Gas, put it, Facebook’s vehement insistence that this was not a data breach is itself a damning statement of what’s wrong with Facebook, and with Silicon Valley’s ad industry in general.

“If your business is building a massive surveillance machinery, the data will eventually be used & misused,” Tufekci, a University of North Carolina professor who studies the social impact of technology, wrote on Twitter. “There is no informed consent because it’s not possible to reasonably inform or consent.”

Facebook’s security team, Tufekci concluded, can’t mitigate the company’s business model, which is predicated on collecting as much of our data, and our friends’ data, as possible.

We can condemn the misuse of this data, and Facebook’s data collection practices, without calling it a data breach, a term that may confuse readers and distract them from what we believe is the real problem here: Silicon Valley giants have built massive data collection machines with almost no guardrails on how they are used.
