Illustration by Jackie Ferrentino

Silicon Valley’s Stonewalling

December 10, 2019

During the 2016 presidential race, Cambridge Analytica, a consulting firm, used data from up to eighty-seven million Facebook profiles to target voters on behalf of Donald Trump. An academic researcher had access to the data as part of a standard research arrangement, but then sold it, in defiance of Facebook’s rules. The breach caused a systemic shock: what had seemed like an innocuous research project allowed a firm to weaponize personal information for use in a propaganda campaign.

Over the past year, Facebook and Twitter have aimed to demonstrate their eagerness to examine everything that went wrong in and around the election. Both companies have announced a host of ambitious projects, empaneling teams of scholars to look at thorny problems like disinformation and “conversational health.” A number of researchers, however, say that the only tangible result of these endeavors has been the press releases. Some participants have given up after failing to get access to the information they need; one told me that working with Twitter was a nightmare. Facebook and Twitter, wary of exposing user data exactly as they did to Cambridge Analytica, have built a fortress. “I still can’t even get API access to the ad archive, and I put in a request a month ago,” Dr. Jennifer Stromer-Galley, a professor at Syracuse University, says of the application programming interface for Facebook’s advertising database, which is often required for any meaningful analysis.

One factor is the European Union’s General Data Protection Regulation (GDPR), which came into effect in May 2018, requiring tech platforms to guarantee the privacy of their users and to acquire consent before sharing personal information. The GDPR doesn’t prevent companies from providing data that doesn’t expose a user’s identity, however, and Facebook, for one, has been more cautious with researchers than with advertisers. “I get that all the tech firms are gun-shy because of Cambridge Analytica and GDPR,” Stromer-Galley says, “but they’re giving data access to companies who are making money off Facebook, and yet as a researcher I’m not granted the same kind of access.”

Last spring, Twitter announced funding for research into improving “the collective health, openness, and civility of the dialogue on our service” and promised that participants would collaborate directly with Twitter’s team. The company said the idea was to produce “peer-reviewed, publicly available, open-access research articles and open source software whenever possible.” Of two hundred and thirty proposals, two research teams were chosen, one from the Netherlands and the other based at Oxford University. By March of this year, one of the teams, led by two Oxford professors, Dr. Miles Hewstone and John Gallacher, and Marc Heerdink, of the University of Amsterdam, dropped the project, unable to reach an agreement with Twitter about how to receive the data they needed. A Twitter spokesman confirmed to me that the remaining team, based at Leiden University, still hasn’t received any data, either. (The spokesman maintained that Twitter would “remain committed to working with outside researchers.”)

Twitter, it seems, didn’t have a process in place to provide the material it had promised. And there were innumerable restrictions: the company refused to provide personally identifiable information on users (understandable), but the researchers couldn’t even get anonymized data (exasperating). There’s no question that automating the release of personal information from vast networks is a potential legal and ethical minefield. But at this point, the projects at Twitter and Facebook are good only for PR.

And they might not even be so good for that. Take Social Science One, which was launched in July 2018 by Gary King, a quantitative social scientist at Harvard, and Nathaniel Persily, a professor at Stanford Law School who codirects the university’s Cyber Law Center. The aim of Social Science One was to work with Facebook to select and provide data for researchers to use.

But after more than a year of discussion, and despite press releases trumpeting that researchers from across the globe had been selected to receive data, Facebook has, to this day, provided only a small amount of relevant material. Facebook “overpromised and underdelivered,” Persily tells me. After threatening in August that they might shut the project down due to a lack of data, the funders of Social Science One—including the Democracy Fund, the Knight Foundation, and the Omidyar Network—said in October that it would continue, at least for now, but complained that Facebook had still failed to provide “access to the breadth and depth of data that funders, independent researchers, and Facebook’s own researchers originally hoped” to see.

Mathew Ingram is CJR’s chief digital writer. Previously, he was a senior writer with Fortune magazine. He has written about the intersection between media and technology since the earliest days of the commercial internet. His writing has been published in the Washington Post and the Financial Times as well as by Reuters and Bloomberg.