The team at Scripps Howard News Service didn’t use any tools that aren’t used in newsrooms across the country in the regular course of reporting. Isaac Wolf, a national reporter at Scripps News, was poking around Google (like you do). He searched for TerraCom—a company that provides federally subsidized phone service to low-income people—and Oklahoma City, where TerraCom is based. Clicking through the results, he came across a PDF that looked to be a completed application to join the subsidized phone program.
“To be honest, I didn’t think that much of it at first,” Wolf says. “When I scrolled through, I saw someone’s name and what appeared to be a social security number.” He’d come across sample applications before, and this looked like a standard document. The website he was looking at was run by a company called Call Centers India Inc., and it didn’t make sense to him, either, that he’d find a copy of an actual application on a website that wasn’t clearly associated with the government or with TerraCom.
“I didn’t quite believe it was real,” he says.
But he ran a site-specific search and another restricted to PDFs. And he knew that he had found something: hundreds of applications containing individuals’ identifying information, available to anyone who could search the Web. The Scripps team set about capturing those documents and used a bit of code to help download tens of thousands more. Those documents were key to Scripps’s investigation of TerraCom, its affiliate YourTel America Inc., and the companies’ privacy practices.
The documents also provoked an unusual response—an accusation that the reporters were “hackers” and that they’d violated the Computer Fraud and Abuse Act, the law under which Aaron Swartz was prosecuted and which reformers say can too easily criminalize even the most routine Internet activity.
After Wolf requested an interview with TerraCom’s chief operating officer, Scripps received a letter from a lawyer representing TerraCom. “The person or persons using the Scripps IP addresses (the “Scripps Hackers”) have engaged in numerous violations of the Computer Fraud and Abuse Act,” the lawyer wrote. “I request that you take immediate steps to identify the Scripps Hackers, cause them to cease their activities described in this letter, and assist the Companies in mitigating the damage from the Scripps Hackers’ activities.”
The CFAA isn’t a law that journalists are taught to look out for. This may be the first time it’s been lobbed at reporters by the subject of an investigation; I couldn’t find anyone who’d heard of another example. But the law could potentially cover journalists’ activities, as reporters scour the Internet for documents and data.
For example, a Center for Public Integrity investigation brushed close enough to the law that the Center’s lawyers told reporters there that a prosecutor could argue that they had violated the CFAA by accessing documents using a password obtained from a source, as CJR has reported. (I worked at CPI for three years, with some of the reporters who conducted this investigation, but was not employed there while it was underway.) And the nonprofit Electronic Frontier Foundation has worked with security researchers who identified vulnerabilities in a company’s online security, tried to report them—as Scripps did when requesting an interview with TerraCom, before publishing its story—and ran into trouble.
EFF is also representing Andrew Auernheimer, also known as weev, the self-identified Internet troll who recently went to prison after being convicted of violating the CFAA. Like the Scripps team, Auernheimer and his codefendant, Daniel Spitler, were dealing with records (in this case, the email addresses of iPad owners, each associated with a SIM card number) that were available on the Web to anyone who happened to find them. Spitler wrote a bit of simple code to download them en masse; Auernheimer distributed the email addresses to journalists; the government charged them both with conspiracy to violate the CFAA.
From a lawyer’s perspective, there’s not much difference between what Auernheimer and Spitler did and what Scripps did—both used fairly simple scraping techniques to acquire information that wasn’t protected by a password, firewall, or other security precaution. Spitler’s code generated random numbers in order to test out all possible SIM card identifiers; Wolf emphasized that Scripps didn’t use unscrambling or random number generation in its code. “Nothing we used was sophisticated or required guesswork or isn’t used in other newsrooms in the most basic capacity,” he says. A spokesman for TerraCom emphasized that only a few hundred of the documents Scripps accessed were available through Google, and that accessing the rest required messing around with the URL to find “non-public directories.”
But, as a matter of law, none of these distinctions matter much. “The real issue is: If information is publicly available on the Web, does accessing that information violate the CFAA?” says EFF staff attorney Hanni Fakhoury.
It’s a slippery enough issue that the behavior of the reporter, researcher, or troll accessing the information makes a big difference.
“Ultimately it comes down to the way you disclose the information. We’ve liked the idea that people should responsibly disclose and that they try to go to the company first to resolve the issue,” says EFF’s Fakhoury.
And any reporter using scraping should pay attention to how they approach the task.
“A smart reporter will get in touch—if it’s government data—will get in touch with the agency first,” says Steve Doig, the Knight Chair in Journalism at Arizona State University, who has consulted with a host of publications on computer-assisted reporting. The best approach may be to avoid the issue altogether, by asking for a copy of a database. “At least set up the script in a way that it doesn’t overload the server,” Doig says. “Have it run in the small hours of the night and have a reasonable rate of requests, so that you’re not doing what’s basically a denial of service attack.”
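Doig’s advice about a reasonable rate of requests can be sketched in a few lines of Python. This is only an illustration, not the code Scripps used: the function names `fetch_url` and `polite_download`, the five-second default pause, and the User-Agent contact string are all assumptions made for the example.

```python
import time
import urllib.request
from typing import Callable, Dict, Iterable


def fetch_url(url: str, timeout: float = 30.0) -> bytes:
    """Download one URL with the standard library and return its raw bytes."""
    request = urllib.request.Request(
        url,
        # Hypothetical contact string so the site operator can reach the reporter.
        headers={"User-Agent": "newsroom-research-script (reporter@example.org)"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


def polite_download(
    urls: Iterable[str],
    delay_seconds: float = 5.0,  # assumed pause; tune it to the server
    fetch: Callable[[str], bytes] = fetch_url,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, bytes]:
    """Fetch URLs one at a time, pausing between requests."""
    results: Dict[str, bytes] = {}
    for index, url in enumerate(urls):
        if index > 0:
            # Wait before every request after the first, so the server
            # never sees a burst of back-to-back requests.
            sleep(delay_seconds)
        results[url] = fetch(url)
    return results
```

The requests go out one at a time with a fixed pause between them, which keeps the load on the server low and predictable. Running the script in the small hours, as Doig suggests, would be left to whatever scheduler the newsroom already uses.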
And that’s exactly why groups like EFF are pushing for reform to the CFAA. They argue that it’s so broadly written that violating a website’s terms of service could, in theory, land a person in prison for years. Operating in good faith and in the service of the public helps. But if a company or a government agency decided to go after a reporter for this type of document diving, it could.
Disclosure: CJR has received funding from the Motion Picture Association of America (MPAA) to cover intellectual-property issues, but the organization has no influence on the content.