Tow Center

A loophole in the Right to Be Forgotten

July 26, 2016
Image: Alexei Kuznetsov (Flickr)

A new study suggests the European Union’s popular Right to Be Forgotten law—which allows EU residents to request that certain articles be delisted from Google and other search engines—may not prove effective, thanks to links that savvy searchers can uncover.

The European Union in 2014 instituted the Right to Be Forgotten (RTBF), which allows EU residents to request that certain articles be delisted from Google and other search engines on the grounds they are “inadequate, irrelevant, no longer relevant, or excessive.” The ruling originated in 2009 when Spanish lawyer Mario Costeja González requested that Google Spain delink two 1998 articles from a Spanish newspaper on foreclosure notices that showed up when his name was searched.

The ruling is intended to allow people to have some control over the Internet’s archive of them, but there are limitations. On its transparency page, Google lists four reasons that links may be delisted: clear absence of public interest (a catch-all that covers links that no longer exist and personal information like addresses and phone numbers); sensitive information, e.g., on a person’s religion or sexuality; content relating to minors; and spent convictions, exonerations, and acquittals for crimes. It won’t delist simply for duplicate links or broken links, or when there is a “strong public interest.”

Beyond the ethical considerations around the ruling itself, there are also more practical issues surrounding its implementation and impact. Google has received almost half a million requests to delist 1.6 million URLs. Earlier this year, Google took a step toward closing one of the biggest loopholes, whereby results that did not show up when a name was searched on google.co.uk, for example, would still show up when searched on google.com in the EU. Now, although there are ways around this, delisted results don’t show up on any Google domain as long as the searcher’s IP address is in the EU.

A new study, “The Right to be Forgotten in the Media: A Data-Driven Study,” presented at the Privacy Enhancing Technologies Symposium last week, demonstrates another loophole, one that hasn’t been taken advantage of yet but which has the potential to undermine the RTBF. I spoke with one of the authors, Keith Ross, Dean of Engineering and Computer Science at NYU Shanghai, who told me that “The Right to Be Forgotten law, when it comes to links to online newspapers in particular, in the end it may not prove very effective at all.” It is possible for a hacker or transparency activist to rediscover delisted links and publish them, en masse, Ross told me.

The study, conducted by a group of researchers out of China and Brazil, simulates how both the delisted links and the requesters can be determined. It works like this, Ross wrote:

First, the attacker targets a particular online newspaper, such as the Spanish newspaper El Mundo, and uses automated software tools to download articles that may be subject to delisting (such as articles about financial or sexual misconduct). Second, he again uses automated tools to get his computer to extract the names mentioned in the downloaded articles. Third, he runs a program to query google.es with each of those names, to see if the corresponding article is in the google.es search results or not. If not, then it is most certainly a RTBF delisted link, and the corresponding name is the person who requested the delisting.
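The three steps Ross describes can be illustrated with a toy sketch. This is not the researchers' code: the article texts, the `news.example` URLs, the naive name pattern, and the `toy_search` function are all hypothetical stand-ins, and nothing here contacts a real newspaper or search engine. The sketch also ignores other reasons an article might be missing from results, such as never having been indexed.

```python
import re
from typing import Callable, Dict, List, Set, Tuple

# Step 1 (assumed already done): articles collected from one newspaper.
# These URLs and texts are invented for illustration.
ARTICLES: Dict[str, str] = {
    "https://news.example/a1": "Ana Perez was fined over unpaid debts.",
    "https://news.example/a2": "Luis Gomez sold the bakery to Marta Ruiz.",
}

# Step 2: a deliberately naive name extractor (two capitalized words in a row).
NAME_RE = re.compile(r"\b[A-Z][a-z]+ [A-Z][a-z]+\b")

def extract_names(text: str) -> Set[str]:
    return set(NAME_RE.findall(text))

# Hypothetical search index standing in for a name query on a search engine.
# The article about "Ana Perez" is missing from the results for her name.
TOY_INDEX: Dict[str, Set[str]] = {
    "Ana Perez": set(),
    "Luis Gomez": {"https://news.example/a2"},
    "Marta Ruiz": {"https://news.example/a2"},
}

def toy_search(name: str) -> Set[str]:
    return TOY_INDEX.get(name, set())

# Step 3: query each name and flag articles absent from that name's results.
def find_suspected_delistings(
    articles: Dict[str, str],
    search: Callable[[str], Set[str]],
) -> List[Tuple[str, str]]:
    suspects = []
    for url, text in articles.items():
        for name in sorted(extract_names(text)):
            if url not in search(name):
                suspects.append((url, name))
    return suspects

if __name__ == "__main__":
    for url, name in find_suspected_delistings(ARTICLES, toy_search):
        print(f"{url} is missing from results for {name!r}")
```

On this toy data, only the first article is flagged, paired with the one name whose search results omit it.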

The study then assesses whether delisting had the opposite of its intended effect in any of the cases, that is, whether delisting an article is correlated with a spike in visibility. This is called the Streisand effect, which the study measures by looking at mentions in Google Trends and Twitter of the (now identified) requester’s name. While the researchers don’t find much of an effect, it’s not hard to imagine a situation in which a “hacker goes out and publishes the links on his or her own website,” Ross said.
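One simple way to picture a check for a spike in visibility is to compare how often a name is mentioned before and after the delisting. The sketch below is an assumption for illustration only; the article does not describe the study's actual statistical method, and the function name `visibility_ratio` and the weekly counts are hypothetical.

```python
from statistics import mean
from typing import Sequence

def visibility_ratio(weekly_mentions: Sequence[float], delist_week: int) -> float:
    """Ratio of average mentions after delisting to average mentions before.

    A value well above 1 would hint at a spike in attention; a value near 1
    suggests no visible change. `delist_week` is the index of the first week
    after the delisting took effect.
    """
    before = weekly_mentions[:delist_week]
    after = weekly_mentions[delist_week:]
    if not before or not after:
        raise ValueError("need at least one week on each side of the delisting")
    baseline = mean(before)
    if baseline == 0:
        # No mentions before: any later mention is an unbounded increase.
        return float("inf") if mean(after) > 0 else 1.0
    return mean(after) / baseline

if __name__ == "__main__":
    # Invented counts: four quiet weeks, then two busier weeks.
    print(visibility_ratio([2, 2, 2, 2, 8, 8], delist_week=4))
```

A real analysis would also need to account for ordinary week-to-week noise before calling any increase a Streisand effect.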

It should be noted that the requests to delist news articles are far outnumbered by the requests to delist social media and private information. The study reports independent research stating that “each of the eight sites for which Google receives the most requests are either social media or profiling sites, and 95 percent of the requests are for delisting of URLs pointing to private information.” In other words, 95 percent of requests do not concern news media. Ross emphasized that this is not often understood: when people think about RTBF, they think about news sites.

Currently, Google notifies the host of a particular link whenever it gets delisted from the search engine, though it does not inform that host of who requested the delisting. Some of these sites, most prominently the BBC, publish a continually updated list of links that are no longer searchable by Google. They do this “as a matter of historic public record” and a “contribution to public policy.” It’s impossible to have a meaningful debate if you’ve not got an idea about what’s being delisted, the BBC’s head of editorial policy, David Jordan, said.

Right now, the Right to Be Forgotten functions as a way to make it not impossible but much harder to discover certain information. But a recent ruling in Brussels threatens the very existence of that information by extending RTBF beyond Google to the news media as well. This ruling relies on the Belgian constitution, not EU law. It is not likely, Ross said, that news organizations will be required to take entire articles down. The ruling will probably ask that certain names within those articles be blacked out. This raises many concerns for the press, not least how online news organizations will archive their stories if they are made to excise the names of certain people. The press’s historical function, and freedom, could be threatened.

Many scholars have called on Google to be more transparent about this process, as a matter of public policy. Even though Google has published its rough guidelines, it hasn’t released aggregate data on what requests are being made, which are honored, and why. (What qualifies as “strong public interest,” for instance?) Ross, on the other hand, as a proponent of the RTBF, believes that (apart from the hole that his team discovered) Google is doing an adequate job of publishing its guidelines, and could not realistically share more without putting the data at risk.

The study ends by acknowledging that it is hard to see how to close the hacking loophole without broad censorship, thereby calling into question the efficacy of the RTBF itself. This, coupled with the fact that the adequate functioning of the RTBF conflicts with press freedom, speaks both to the conflict between privacy and transparency, and to the necessity for data giants such as Google to become greater stewards of transparency.

This article has been updated.

About the Tow Center

The Tow Center for Digital Journalism at Columbia's Graduate School of Journalism, a partner of CJR, is a research center exploring the ways in which technology is changing journalism, its practice and its consumption — as we seek new ways to judge the reliability, standards, and credibility of information online.
