Why ‘diffing’ could make news organizations more transparent

August 10, 2015

In a front-page story last month, The New York Times cast Hillary Clinton as the subject of a criminal investigation. Within hours, key parts of the story started to fall apart. It turned out that Clinton was not the subject, and the investigation was not of a criminal nature. 

Matt Purdy, an editor who worked on the story, explained the mistake by punting it to the reporters’ unnamed sources. “We got it wrong because our sources got it wrong,” he told Margaret Sullivan, the paper’s public editor.

The initial story was published to the Web on a Thursday night and landed on the front page of Friday’s paper. That day, the Justice Department issued a statement conflicting with the Times’ reporting. On Saturday, the word “criminal” was dropped from the headline and lede, and on Sunday, two corrections were published in the print edition of the paper.

What upset many readers was not just that the Times had been incorrect, but that the paper was slow to own up to it. The first substantial change to the story—that Clinton was not the target of the investigation—was made two hours after it was first published to the Web, but a correction was only appended on Friday afternoon, more than 12 hours later.

Sullivan was not impressed. “When mistakes inevitably happen, The Times needs to be much more transparent with readers about what is going on,” she wrote. “Just revising the story, and figuring out the corrections later, doesn’t cut it.”

An exact anatomy of how the story morphed can be found at NewsDiffs, a site that tracks the changes made to every article on the Web sites of The New York Times, The Washington Post, and several other news outlets. Each new version of a story can be compared to the previous version: additions are highlighted in green, deletions in pink. It’s a simple, visually appealing—and in the case of the Clinton story, dramatic—way to follow the evolution of a story.
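For readers curious about the mechanics, the comparison described above can be sketched in a few lines of Python using the standard library's difflib module. This is only an illustration, not NewsDiffs' actual code: the function names word_diff and render are made up for this example, the sample sentences are invented rather than quoted from any story, and bracket markers stand in for the green and pink highlighting.

```python
import difflib


def word_diff(old: str, new: str) -> list[tuple[str, str]]:
    """Compare two versions word by word.

    Returns (tag, text) pairs where tag is 'same', 'added', or 'deleted'.
    """
    old_words, new_words = old.split(), new.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    result = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "equal":
            result.append(("same", " ".join(old_words[i1:i2])))
            continue
        if i2 > i1:  # words present only in the old version
            result.append(("deleted", " ".join(old_words[i1:i2])))
        if j2 > j1:  # words present only in the new version
            result.append(("added", " ".join(new_words[j1:j2])))
    return result


def render(diff: list[tuple[str, str]]) -> str:
    """Mark deletions as [-...-] and additions as [+...+] (stand-ins for pink and green)."""
    marks = {"same": "{}", "added": "[+{}+]", "deleted": "[-{}-]"}
    return " ".join(marks[tag].format(text) for tag, text in diff)


# Invented sample sentences, for illustration only.
old_version = "Officials requested a criminal inquiry into the email account"
new_version = "Officials requested an inquiry into the email account"
print(render(word_diff(old_version, new_version)))
```

Comparing whole words rather than individual characters keeps the output readable for prose, which is one reason a dropped word like "criminal" stands out so clearly.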

When Steve Rhodes, a Chicago-based reporter, saw the NewsDiffs for the story, he was blown away, he said. “It was such a dramatic rewriting of the story. I thought, ‘Readers should see this.’”

Rhodes tweeted to Sullivan, “Solution: with every iteration, make changes visible to the reader. In other words be ur own @NewsDiffs”

A brief exploration of NewsDiffs demonstrates that news outlets are routinely making changes after they publish, and not only copyediting changes. Most are fairly innocuous: sentence structure is changed, paragraphs are moved up or down, statements are added, quotes omitted, headlines are made more social media-friendly. But the potential for more substantive changes with a few keystrokes raises concerns about transparency.

Yet the digital medium that affords this mutability may also supply an antidote. What if news outlets tracked their own changes and made them available to the public on their sites, as Rhodes suggested?

When Arthur Brisbane was public editor at The Times from 2010 to 2012, he expressed support for a versioning feature like NewsDiffs, which he says would provide a documentary record and promote transparency and accountability. “You can append it to an explanation, an editor’s note, for clarity, for a forensic look at what happened,” he told CJR.

Brisbane explained that a more transparent process is necessary today than in the print era, when “you might have multiple editions, so you might have an opportunity to make a change once or twice. In the digital era it’s unlimited.” It’s not that changing published copy is wrong per se, says Brisbane, but when the changes raise questions, they should be explained. He told CJR that he would recommend that publications like The Times adopt a more selective process than NewsDiffs, because the volume of tracking every change could become cumbersome.

Versioning (also called “diffing”) is a fact of life in the world of computer programming. It allows programmers to revert to earlier versions of their work, see changes made by collaborators, or understand the logic of their own code. Many other digital services, including Google Docs, Wikipedia, and GitHub (a popular code-sharing site), offer diffing features. Some non-programmers use GitHub for writing because of the advantages of the “diffs” feature.

NewsDiffs was created by a three-person team in 36 hours at a Knight Mozilla MIT hackathon in 2012. One of its creators, Eric Price, then a graduate student at MIT, puts in a few hours of maintenance work every few months, but otherwise the site tracks and displays versions of news stories automatically, fueled by a simple open-source script available on GitHub, and hosted on MIT’s student servers.

Now an assistant professor in the University of Texas computer science department, Price says that the most egregious cases he’s seen are those in which news outlets rewrite an entire story at the same URL. In the lead-up to the 2012 presidential election, the Times published an article critical of Mitt Romney’s response during the Benghazi attacks. Two hours later, with news from Benghazi still developing, the Times replaced the article with a new, more forceful one on the same subject.

Scott Rosenberg, a co-founder of and longtime tech journalist, is another advocate for a versioning system for news outlets. In 2010, he created a WordPress plugin for just that purpose. It allows readers to view a story’s iterations, using WordPress’s built-in versioning system, by displaying a time stamp each time a story has been updated.

Rosenberg says that as long as third-party sites like NewsDiffs exist, it looks like journalists have something to hide. “Right now, it’s a cat-and-mouse game,” says Rosenberg. When a news outlet makes a change without issuing a correction, “the framing of it is that journalists are hiding something and the public is finding out.”

Price echoed this sentiment: “Right now articles have a ‘gotcha’ tone. A writer will write about a change in The New York Times with an attitude of having caught them.”

As journalists, we subscribe to the logic that transparency is one way of ensuring increased accountability. So why not apply that to ourselves?

No one is suggesting that diffing would replace the corrections system already in place. Any substantive change that today requires a correction would continue to require one, because with diffing, there’s no easy way to tell if a change was the addition of a comma, the deletion of an entire paragraph, or of a single—but critical—word, like “criminal.”

Price would like to change that. He wants to make it easier to find significant changes among the thousands being made every moment. He’d also like to make NewsDiffs more useful to readers, perhaps by adding a browser extension that would allow users to display the highlighted diffs while reading stories on the Times website, for example.
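How might software separate a moved comma from a dropped "criminal"? The sketch below is one possible heuristic, not Price's approach or anything NewsDiffs is known to do. The function classify_change, the WATCH_WORDS list, and the 0.5 rewrite threshold are all assumptions chosen for illustration, and the sample sentences are invented.

```python
import difflib
import string

# Hypothetical list of words whose addition or removal deserves a closer look.
WATCH_WORDS = {"criminal", "not", "alleged", "suspect"}


def normalize(text: str) -> list[str]:
    """Lowercase, strip ASCII punctuation, and split into words.

    Note: curly quotes and other non-ASCII punctuation are not stripped.
    """
    table = str.maketrans("", "", string.punctuation)
    return text.lower().translate(table).split()


def classify_change(old: str, new: str, rewrite_threshold: float = 0.5) -> str:
    """Label an edit as 'cosmetic', 'substantive', 'rewrite', or 'minor'."""
    a, b = normalize(old), normalize(new)
    if a == b:
        return "cosmetic"  # only punctuation, capitalization, or spacing changed
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    changed = set()
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op != "equal":
            changed.update(a[i1:i2])
            changed.update(b[j1:j2])
    if changed & WATCH_WORDS:
        return "substantive"  # a watched word was added or removed
    if matcher.ratio() < rewrite_threshold:
        return "rewrite"  # most of the text was replaced
    return "minor"


# Invented sample sentences, for illustration only.
print(classify_change("The inquiry, officials said, is ongoing.",
                      "The inquiry officials said is ongoing"))
print(classify_change("Officials requested a criminal inquiry",
                      "Officials requested an inquiry"))
```

A word list like this is crude, and a newsroom would need to tune it by hand. Even so, it suggests that a first pass at ranking changes need not involve reading every diff.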

In the rare event of a legal or ethical issue—say the wrong person is mistakenly named as a suspect in a crime story—news outlets would still have the opportunity to delete the record. But Price says that in the three years of NewsDiffs’ existence, he’s never come across a situation that required a deletion from the record.

Ultimately, there are two reasons for publications to at least consider adding diffing functionality. The first is that it would signal a fuller embrace of the Web’s ethos of transparency. The second is that journalists, after all, are human. The temptation to tinker with a story, especially if it has only been up for a few hours, is strong, as the Clinton story demonstrates.

Price says that’s something he’s seen on NewsDiffs. “You can find examples where they”—the news outlets—“wrote a factual change, but it was only online for a little bit, so then they didn’t bother with corrections.”

If even a few publications began diffing, it would at the very least force news outlets to ask whether they can withstand continuous (even retroactive) scrutiny. If the answer is no, then maybe the sausage-making of journalism should happen before a story is published to the Web, not after.

Chava Gourarie is a freelance writer based in New York and a former CJR Delacorte Fellow. Follow her on Twitter at @ChavaRisa