the news frontier

Q & A: David Plotz and Chris Wilson on Slate Labs

“When you build the data yourself, you can be fairly certain no one else is going to have the story.”
August 18, 2010

Last week, Slate launched Slate Labs, a collection of their “experiments in multimedia journalism.” Curated by programmer-journalists Chris Wilson and Jeremy Singer-Vine, the project is meant both to show off their past work—from maps to interactive charts to games—and to encourage reader feedback and participation. It even lives on a separate server from the rest of the site, so no one has to worry about crashing the main site with a multimedia experiment gone awry. The experiments range from the silly (The Dan Brown Plot Generator) to the stirring (an animated map of several oil-spill scenarios).

CJR assistant editor Lauren Kirchner spoke with Slate editor David Plotz and associate editor Chris Wilson about their goals for multimedia journalism, and why every journalist could benefit from a little computer programming knowledge. This is an edited transcript of those conversations, the first of two parts.

What’s your favorite interactive project that you’ve developed so far?

David Plotz: I love the lost jobs map, that one in particular was just a really vivid way to illustrate the depression, and it’s one that we’ve updated frequently. As a tool for seeing how the recession is hitting where it’s hitting, it’s marvelous. People love visual journalism, they love games, and I think in Chris [Wilson], we have somebody who is very masterful in understanding data and how to use it to really make it stand out and move people. Slate is obviously a magazine that was born in text, I suppose, but to really live and thrive in this world, we have to do this well.

How do you decide which stories are worth visualizing and making interactive?

DP: I think the best kinds are stories where there’s a lot of abstraction. With the oil spill, you had stories that were being told from the point of view of being at the well, or being at the marshland or something, the pointillist version of it, that’s very important, but telling it from 200 miles up in the atmosphere [with an interactive map] is in one sense more vivid, or at least a useful addition to the understanding of the story.

Chris Wilson: Certainly any time there’s a big data element, that’s the low-hanging fruit. Maps are one of the most popular things. Any time you try to do any kind of data visualization with graphs and charts, there’s a little bit of a higher bar to get readers interested in it, it’s a bit of a harder sell. But maps, people just love. The job loss map that we did was the most popular thing that we’ve ever done, multimedia wise. My favorite was probably the map of the growth of the Tea Party movement.

I was going to ask about that Tea Party map. To track the growth of the movement, you went to Meetup.com and searched for events and groups tagged “Tea Party” and “politics.” That’s a very creative way to find data, but it’s not exactly an objective measurement. Is that a problem?

CW: That project is a great example of something that’s a bit of a risk in terms of time investment: it took a fairly long time to gather the data, and then it could have turned out that the data just didn’t make sense, or showed no real trend. Then even if it had been accurate, that wouldn’t have necessarily been interesting. There has to be a bit of a smell test. So we looked at the explosions of events around tax day, April 15, and it appeared to mimic reality in a believable way, such that it verified that there was some kind of story there.

How did you actually collect the data?

CW: Far and away the skill that I’ve gotten the most use out of, and the first thing I recommend to anybody who’s interested in this kind of stuff is “screen-scraping.” There’s a free program called screen-scraper, which lets you go through a bunch of Web pages and take the data that you want from them and put it into a spreadsheet. If I had tried to do that by hand, it would have been days and days of tedious copying and pasting. But using this very simple piece of software, you can teach a computer how to do the copying and pasting for you.

When you build the data yourself, you can be fairly certain that no one else is going to have the story, because no one else has the data. We’ve done that a couple of times, and it becomes proprietary because it’s very hard for another publication to replicate what you’ve done.
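
For readers curious what "teaching a computer to do the copying and pasting" can look like, here is a minimal sketch in Python using only the standard library. It is not the screen-scraper program Wilson mentions, and it is not Slate's code. The page markup (spans with the classes "name" and "date"), the sample pages, and every function name are assumptions made up for illustration.

```python
import csv
import io
from html.parser import HTMLParser


class EventParser(HTMLParser):
    """Collect (name, date) rows from a hypothetical event-listing page."""

    def __init__(self):
        super().__init__()
        self.rows = []       # finished (name, date) rows
        self._field = None   # which field we are currently inside, if any
        self._current = {}   # row being assembled

    def handle_starttag(self, tag, attrs):
        # Assumed markup: <span class="name">...</span><span class="date">...</span>
        css_class = dict(attrs).get("class")
        if tag == "span" and css_class in ("name", "date"):
            self._field = css_class
            self._current[css_class] = ""

    def handle_data(self, data):
        # Text may arrive in pieces, so accumulate rather than overwrite.
        if self._field:
            self._current[self._field] += data

    def handle_endtag(self, tag):
        if tag == "span" and self._field:
            self._field = None
            if "name" in self._current and "date" in self._current:
                self.rows.append((self._current["name"].strip(),
                                  self._current["date"].strip()))
                self._current = {}


def scrape(pages):
    """Parse each page's HTML and return all extracted rows in order."""
    rows = []
    for html in pages:
        parser = EventParser()
        parser.feed(html)
        parser.close()
        rows.extend(parser.rows)
    return rows


def to_csv(rows):
    """Render rows as CSV text that a spreadsheet program can open."""
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(["name", "date"])
    writer.writerows(rows)
    return out.getvalue()


# Made-up sample pages standing in for downloaded Web pages.
SAMPLE_PAGES = [
    '<div><span class="name">Tax Day Rally</span>'
    '<span class="date">2010-04-15</span></div>',
    '<div><span class="name">Town Hall</span>'
    '<span class="date">2010-05-01</span></div>',
]

if __name__ == "__main__":
    print(to_csv(scrape(SAMPLE_PAGES)))
```

In practice the pages would be downloaded first and the real markup would differ from site to site; the point is only that the extraction rule is written once and then applied to every page.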

Another project, your “Ideological Media Map”, would be the opposite kind of process—you took raw data from a research project, but then made it visual in a way that would be much more accessible to readers than a spreadsheet. That seems to align with a lot of things that Slate does, such as “The Explainer” column.

CW: Sure, that’s taking something that’s out there and presenting it in an appealing way. To me, that’s a perfectly successful project, if you allow readers to access and understand data, even if they could download it themselves.

Even after you collect the data, building these interactive elements must be incredibly time consuming.

DP: One of the things that I hope Labs is going to do is to develop templates for particular kinds of projects, so that even if the data is a different set of data, that you’ve already got the map function and you already know how it’s going to work. So you can create templates and then plug in different kinds of data sets. So we’ll have the job-loss map, and then maybe next time it’s not a map of the country, it’s a map of the world, and it’s not jobs, it’s McDonald’s.

CW: These things do take a fairly long time, much longer than writing an article. But we aspire to build a code library, so that we have all these different tools that are unique to Slate that we can then deploy very quickly if we want to use them again.

Do you think that these experiments are worth spending time on even if they might not necessarily have journalistic merit? Some things I see on the Labs server seem to have no real purpose, they’re just a fun thing to play around with. For instance, this “Facebook Name Explorer” charting all of the first and last names of Facebook users.

DP: I tend to be very liberal about this. When you have people like Chris and Jeremy, they’re going to have lots and lots of ideas. Some of them are going to be hardcore investigative journalism, some are going to be playful. If there are things they want to play around with, but that aren’t necessarily going to win us a National Magazine Award, that’s okay.

CW: The name map thing is actually pretty interesting. It certainly doesn’t have any news peg, or any argument attached to it, but I do think that names have a kind of sociological importance. For instance, you can see that some last names are associated with first names from a lot of different national origins, and other last names are more strictly tied to very common Biblical names like David or Christopher. I think this could be a tool for people to play around with and come to different conclusions; it’s more than a curiosity in my mind.

So I guess some of these data projects can start out as experiments without specific goals attached, but can then actually generate story ideas, maybe with the help of your readers.

CW: Right. We can do a fairly bare-bones visualization tool for a huge amount of data, where I certainly don’t have the time to go through all 15,000 Facebook names and look for interesting things, but if you lay it out and let people explore it and then encourage them to email you, then you can enlist your readers in a way that is fun for them. Then if they come up with anything great, you say “reader Rob Jones found this” and they get their name in the publication.

To me, that’s a great example of—you wouldn’t call it crowdsourcing, because crowdsourcing to me is more like using your readers for manual labor. Although that sometimes works, too; readers are often more than happy to pitch in. For instance, the Guardian did a brilliant project on the scandal with the members of Parliament and their abuse of expense accounts. This huge, scanned PDF document came out, so the Guardian set up something where readers could go through it and flag whether a given one of these 150,000 pages had anything interesting on it. The way they did it was just brilliant because readers could log in and get a score for how many pages they had gone through, and they made a leaderboard, that kind of thing.

So they got their readers to help by turning it into a game.

CW: Exactly.
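
The scoring idea Wilson describes (readers flag pages, earn a point per page reviewed, and appear on a leaderboard) could be sketched roughly as follows. This is an illustrative guess at the bookkeeping, not the Guardian's actual system. The class name, method names, and reader names are all hypothetical.

```python
from collections import Counter


class ReviewBoard:
    """Track which pages readers have reviewed and flagged (hypothetical design)."""

    def __init__(self):
        self.scores = Counter()  # reader -> number of distinct pages reviewed
        self.flags = {}          # page number -> set of readers who flagged it
        self._seen = set()       # (reader, page) pairs already counted

    def review(self, reader, page, interesting):
        """Record one review; a reader scores each page at most once."""
        if (reader, page) in self._seen:
            return
        self._seen.add((reader, page))
        self.scores[reader] += 1
        if interesting:
            self.flags.setdefault(page, set()).add(reader)

    def leaderboard(self, top=10):
        """Return the highest-scoring readers as (reader, score) pairs."""
        return self.scores.most_common(top)


if __name__ == "__main__":
    board = ReviewBoard()
    board.review("ann", 1, interesting=False)
    board.review("ann", 2, interesting=True)
    board.review("bob", 2, interesting=True)
    board.review("ann", 2, interesting=True)  # repeat review, not counted again
    print(board.leaderboard())
    print(sorted(board.flags))
```

Counting each reader and page pair only once is a design choice made here so that a score reflects pages actually covered; pages flagged by several readers could then be sent to reporters first.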

Read the second part of this conversation here.

Lauren Kirchner is a freelance writer covering digital security for CJR. Find her on Twitter at @lkirchner