In 2006 Adrian Holovaty, then a programmer and journalist of some reputation, wrote a blog post entitled “A fundamental way newspaper sites need to change.” In the five years since he published it, Holovaty has won a Knight News Challenge grant, launched EveryBlock, sold it to MSNBC.com, and become one of the leading programmer/journalists working today. Over those years, his post has come to be seen as one of the more important, and prophetic, pieces of writing about what we now call data journalism.
So much of what local journalists collect day-to-day is structured information: the type of information that can be sliced-and-diced, in an automated fashion, by computers. Yet the information gets distilled into a big blob of text — a newspaper story — that has no chance of being repurposed.
In the same post, Holovaty explained what he meant by structured data: “information with attributes that are consistent across a domain. Every fire has those attributes, just as every reported crime has many attributes, just as every college basketball game has many attributes.”
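To make the idea concrete, here is a minimal sketch in Python of what "attributes that are consistent across a domain" might look like for fires. The field names (`reported_on`, `neighborhood`, `response_minutes`, `injuries`) and the sample records are invented for illustration; they are not taken from Holovaty's post or from any real dataset.

```python
from dataclasses import dataclass
from datetime import date
from statistics import mean


@dataclass(frozen=True)
class FireReport:
    # Hypothetical attributes: every record in the domain carries the same ones.
    reported_on: date
    neighborhood: str
    response_minutes: int
    injuries: int


# Invented sample records, for illustration only.
reports = [
    FireReport(date(2011, 3, 2), "Northside", 6, 0),
    FireReport(date(2011, 3, 9), "Riverfront", 11, 2),
    FireReport(date(2011, 4, 1), "Northside", 8, 1),
]


def average_response_by_neighborhood(records):
    """Group reports by neighborhood and average their response times."""
    grouped = {}
    for record in records:
        grouped.setdefault(record.neighborhood, []).append(record.response_minutes)
    return {name: mean(times) for name, times in grouped.items()}


averages = average_response_by_neighborhood(reports)
print(averages)
```

Because each fire is a record with the same fields, a question like "which neighborhood waits longest for a fire crew?" becomes a few lines of code. The same facts buried in three separate news stories would have to be re-read and re-typed by hand.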
This view has come to be accepted and championed by many important people and organizations. Many efforts are now under way to bring structure to all manner of information, though plenty of work remains. Notably, one slice of data that still lacks structure in the United States relates to journalists themselves.
We each have attributes like a phone number, e-mail address, title, beat, employment history, voting history, education history, Twitter username, published articles and reporting, frequently quoted sources… The list goes on.
These attributes don’t tell the whole story of a journalist, just as a box score doesn’t encapsulate a sports game. But they are material to the whole. And they are, for the most part, unavailable or at the very least disorganized and distributed. Unstructured, as Holovaty might say.
This reality was highlighted by a recently launched effort from Ira Stoll, the former New York Sun vice president and managing editor who now runs FutureOfCapitalism.com. His new project is News Transparency, a website that seeks to act as a central database for the attributes of American journalists. Anyone can create a profile for a journalist or add to an existing profile. People can also offer feedback on the quality of a journalist’s work, or make note of a prediction made by the journalist.
“This site aims to improve the accuracy, quality, and transparency of journalism by making it easier to find out about the individual human beings who produce the news — human beings with opinions, relationships, history, and agendas,” reads the site’s about page. “That information should help readers, viewers, and listeners put what they are reading in better context, and it may even prompt some improvements by the journalists.”
News Transparency’s launch received a decent amount of press attention, with Poynter, Forbes, AFP and others writing about it. When I called Stoll recently to check in on the launch, he was on the other line with a French reporter. He said traffic on the site has been more international than expected.
That isn’t entirely surprising. The U.K. has been home to a similar site, Journalisted, since 2007. Journalisted describes itself as “an independent, not-for-profit website built to make it easier for you, the public, to find out more about journalists and what they write about.” Perhaps there are folks out there who have been waiting to see a little structure brought to American journalists.
Overall, the online trend is towards disclosing more information about journalists. News sites are putting journalists’ photos, e-mails, Twitter accounts, and other contact and connection information with a byline. Forbes’s new article page includes prominent merchandising of the reporter’s information and most popular work. Google recently announced it will use the Google+ profiles of journalists to add byline information in Google News.
These are examples of very basic profile and contact information. In contrast, Journalisted offers analytics about what journalists cover, whom they write about, and how frequently they publish. These data have the potential to tell an interesting story about the people who cover the news, and therefore about the news itself. Imagine having access to a journalist’s commonly cited sources, basic information about their financial holdings, their most commonly covered topics, corrections to their work, their voting history, and so on.
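As a rough illustration of that kind of analysis, the Python sketch below computes a reporter's most-covered topic, most-cited source, and average gap between articles from a list of structured article records. Everything in it is an assumption made for the example: the `Article` fields, the `summarize` helper, and the sample data are invented. It does not describe how Journalisted or News Transparency actually store or compute anything.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Article:
    # Hypothetical fields for one published story.
    byline: str
    published_on: date
    topic: str
    sources: tuple


# Invented sample records, for illustration only.
articles = [
    Article("Jane Doe", date(2011, 5, 2), "city budget",
            ("Mayor's office", "Budget watchdog")),
    Article("Jane Doe", date(2011, 5, 9), "city budget",
            ("Mayor's office",)),
    Article("Jane Doe", date(2011, 5, 30), "schools",
            ("Teachers' union", "Mayor's office")),
]


def summarize(byline, records):
    """Summarize one reporter's topics, sources, and publishing pace."""
    mine = sorted(
        (a for a in records if a.byline == byline),
        key=lambda a: a.published_on,
    )
    topics = Counter(a.topic for a in mine)
    sources = Counter(s for a in mine for s in a.sources)
    # Days elapsed between each article and the next one.
    gaps = [(b.published_on - a.published_on).days for a, b in zip(mine, mine[1:])]
    return {
        "top_topic": topics.most_common(1)[0][0] if topics else None,
        "top_source": sources.most_common(1)[0][0] if sources else None,
        "average_days_between_articles": sum(gaps) / len(gaps) if gaps else None,
    }


summary = summarize("Jane Doe", articles)
print(summary)
```

None of this is sophisticated, and that is the point: once bylines, dates, topics, and sources are stored as consistent attributes, questions about a reporter's habits become simple counting.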