Tow Center

Boosting local news with data journalism and automation

January 31, 2019

The most recent wave of layoffs hitting the news industry offers a harsh reminder: digital media still hasn’t found a sustainable economic equilibrium. The future of local journalism looks particularly bleak, with some communities now considered “news deserts” as local papers disappear or are reduced to husks of their former selves. But amid these declines a new concept is being tested: data journalism, along with some cleverly applied automation, could be part of the antidote.

The RADAR (Reporters And Data And Robots) project was born out of a collaboration between the UK Press Association and a London-based startup called Urbs Media. With a team of just five data reporters (and two editors), it produces an average of about 8,000 local stories per month across the UK. Its stories are run by various local media outlets that subscribe to a wire service it provides.

To produce its localized stories, RADAR draws on freely available open government datasets that are tabulated by geographic area (the granularity varies by dataset but is typically around 200 areas). Each reporter develops about two stories per week into data-driven templates, which include fragments of text and logical if-then-else rules for translating the data into location-specific text. The core structure of the stories might be somewhat similar across versions, but the details will be locally tailored. “Department for Transport data shows 18 people were killed and 162 people seriously injured on Shropshire’s roads in 2017,” reads the lede of one story adapted to the county of Shropshire in western England.
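To make the idea of a data-driven template concrete, here is a minimal Python sketch. It is not RADAR's actual tooling, which the article does not describe; the `AreaStats` fields, the `render_story` function, and the year-on-year trend sentence are all hypothetical, chosen only to show how text fragments and if-then-else rules can turn one row of area-level data into a localized lede.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AreaStats:
    """One row of an area-level dataset (field names are hypothetical)."""
    area: str
    year: int
    killed: int
    seriously_injured: int
    killed_prev_year: Optional[int] = None  # optional, for a trend sentence


def people(n: int, with_verb: bool = False) -> str:
    """Return '1 person' / '18 people', optionally with 'was' / 'were'."""
    noun = "person" if n == 1 else "people"
    if not with_verb:
        return f"{n} {noun}"
    verb = "was" if n == 1 else "were"
    return f"{n} {noun} {verb}"


def render_story(s: AreaStats) -> str:
    """Fill the fixed text fragments with local figures, then apply rules."""
    lede = (
        f"Department for Transport data shows "
        f"{people(s.killed, with_verb=True)} killed "
        f"and {people(s.seriously_injured)} seriously injured "
        f"on {s.area}'s roads in {s.year}."
    )
    # If no comparison figure is available, the lede stands alone.
    if s.killed_prev_year is None:
        return lede
    # If-then-else rule: choose the trend wording from the data.
    if s.killed > s.killed_prev_year:
        trend = f"Road deaths were up from {s.killed_prev_year} the year before."
    elif s.killed < s.killed_prev_year:
        trend = f"Road deaths were down from {s.killed_prev_year} the year before."
    else:
        trend = "Road deaths were unchanged from the year before."
    return f"{lede} {trend}"


# The Shropshire figures come from the lede quoted above.
print(render_story(AreaStats("Shropshire", 2017, 18, 162)))
# "Exampleshire" and its numbers are invented purely for illustration.
print(render_story(AreaStats("Exampleshire", 2017, 1, 10, killed_prev_year=3)))
```

Running the same function over every row of a dataset with roughly 200 areas would yield roughly 200 versions of the story, each with the same structure but locally specific details.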

The data journalists at RADAR are tasked with first figuring out various angles and storylines for the data. Then they do reporting to add broad-strokes background information and national context, which are written into a template with a basic story structure. Automation is then used as a “production assistant” to adapt some text in the templates to the local level.

A single data journalist can produce about 200 regionally specific stories for each of the two templates they write every week.
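These figures are consistent with the monthly total cited earlier, as a quick back-of-the-envelope check shows. The sketch below uses only numbers from the article, plus one assumption: a month is treated as roughly four weeks.

```python
REPORTERS = 5                        # data reporters on the team
TEMPLATES_PER_REPORTER_PER_WEEK = 2  # templates each reporter writes weekly
STORIES_PER_TEMPLATE = 200           # regional versions per template
WEEKS_PER_MONTH = 4                  # assumption: a month is about four weeks


def stories_per_week() -> int:
    """Localized stories the whole team produces in one week."""
    return REPORTERS * TEMPLATES_PER_REPORTER_PER_WEEK * STORIES_PER_TEMPLATE


def stories_per_month() -> int:
    """Approximate monthly output under the four-week assumption."""
    return stories_per_week() * WEEKS_PER_MONTH


print(stories_per_week())   # 2000
print(stories_per_month())  # 8000
```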

Local outlets sometimes run the stories as-is and other times treat them more like leads, adapting them further to enhance the local relevance. Gary Rogers, the editor in chief of RADAR, told me that an audit of their content generated during October 2018 showed that print publications will rework about half of the stories, whereas online publications typically rework maybe one in five.

Martin Wright, the editor in chief of the Midlands News Association, explains the different use cases. “We had some figures from RADAR that a quarter of all calls to the NHS (the National Health Service) helpline end with people going to accident and emergency services, which is a big issue for us in Shropshire,” he says. “We then used that. . . to do a page inside the publication looking at. . . the wider issue of why the NHS is under such pressure.”

While the raw RADAR copy includes some national context, it’s often helpful to get sources who can offer a local reaction and make the story more relevant to the community. That’s also a chance to context-check the data against what people are saying on the ground in that locality. Of course, this means there needs to be a local reporter ready to jump on the story and do some more work on it. But that extra work helps make the treatment of the story, and its copy, more distinctive and original, which is a competitive differentiator.

Tim Robinson at JPI Media, a news publishing chain also receiving the stories, says the project has helped his newsrooms: “It enables us to cover subjects which we potentially wouldn’t be able to cover at all, or certainly in volume,” he tells CJR. JPI, however, publishes 95 percent of RADAR’s stories as provided, with only minor tweaks to the copy and with geographically resonant headlines added. Its online sites have already seen a boost in page views since they started mixing in the material from RADAR.

A key to the success of the RADAR process is that human journalists determine what angles, trends, or outliers are newsworthy in the data and structure the template for the various versions of the article. The automation then helps to locally adapt the writing, and if needed a local journalist can add to it to improve the local relevance.

The government datasets used by RADAR help to highlight municipal problems in hospitals, with street crime, and in public services like firefighting or education. “Their understanding of the regional news agenda seems to be pretty much in tune with the kind of content we’re interested in,” Robinson explains. To get a feel for the range of content, you can see the feed of weekly advisories RADAR publishes to Twitter. It’s not just fluff—the content is filling a real gap in local coverage for under-resourced publishers.

Another benefit to using open government datasets is that they’re freely available. Government has already invested the money and effort to collect the data, and RADAR leverages the data to add value for newsrooms. While RADAR has filed a few public records requests to get data, that labor-intensive process often involves a lot of data tidying once the data is received and may not spark joy. Rogers says RADAR isn’t looking to get into gathering its own data just yet. “There’s so much available for us in public data that we’re not even touching,” he explained.

RADAR initially offered its wire service for free, thanks to a grant from the Google Digital News Initiative in 2018. But now it, too, is looking to become sustainable and is starting to sign on paying customers. Rather than charge based on the number of stories published, its pricing model is based on how many local geographic areas a publisher wants to cover. Once a publisher pays for those areas, it can access and publish as much of that content as it wants.

This is a pivotal period for the company. If RADAR survives the transition to a paid service, it will have demonstrated a sustainable model employing just five data journalists that is dialing up the amount of meaningful local coverage throughout the UK.

Nicholas Diakopoulos is an assistant professor at the Northwestern University School of Communication, the author of the book Automating the News: How Algorithms Are Rewriting the Media, and a regular contributor to CJR.