the news frontier

Visualizing the Iraq War Logs

How The New York Times and The Guardian did it
October 27, 2010

When WikiLeaks gave seven news organizations access to 400,000 previously classified military documents pertaining to the Iraq war, one of the many challenges those organizations faced was how to depict some of that information visually for their readers. This massive amount of new information clearly called for more than photo slideshows, but each team took a slightly different approach based on its expertise, its available staff time, and the content it chose to emphasize.

Al Jazeera English, for instance, made several basic charts, graphs, and maps for readers to click through, and an animated map of IED attacks across Iraq. Der Spiegel created an interactive map of a single day of violence, as did the Times and The Guardian (though they all chose different days). The New York Times’s maps of violence in Baghdad are, like most Times graphics, clean, informative, and straightforward. The Guardian’s Flash-animated, text-narrated map of one day in the fall of 2006 plays like a very understated documentary film, and is much more emotionally affecting.

Simon Rogers, editor of The Guardian’s Datablog and Datastore, who also spoke to CJR this summer after WikiLeaks released thousands of Afghanistan war logs, said that he and his interactive team felt more comfortable with the data this time around. There were more than four times as many documents this time, but the process of trial and error from the last time informed how they chose to filter and present this new information. For instance, before they attempted any visualization project, they found that they could analyze the data just by manipulating rows and columns in a spreadsheet, grouping the reports by type or time period.

They found, for instance, “that murder was the biggest cause of death, which is really indicative of the sectarian violence that was going on in Iraq,” Rogers said. “You can actually learn quite a lot from just doing a bit of basic work within Excel, with that large size of data.”
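The kind of spreadsheet grouping Rogers describes can be sketched in a few lines of code. What follows is only an illustration, not The Guardian's actual workflow (the team worked in Excel): the rows, the column names `category` and `deaths`, and the helper `deaths_by_category` are all invented for the example.

```python
from collections import Counter

# Hypothetical rows standing in for war-log reports. The column names and
# the numbers are invented for illustration; they are not real log data.
reports = [
    {"date": "2006-10-17", "category": "Murder", "deaths": 3},
    {"date": "2006-10-17", "category": "IED Explosion", "deaths": 2},
    {"date": "2006-12-20", "category": "Murder", "deaths": 5},
    {"date": "2007-01-05", "category": "Direct Fire", "deaths": 1},
]


def deaths_by_category(rows):
    """Sum the deaths column for each report category (a one-column pivot)."""
    totals = Counter()
    for row in rows:
        totals[row["category"]] += row["deaths"]
    return totals


by_category = deaths_by_category(reports)

# List categories from most to fewest deaths, as a sorted pivot table would.
for category, total in by_category.most_common():
    print(category, total)
```

With real data, the same grouping could be keyed on time period instead of category, which is the other cut Rogers mentions.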

As he did last time, Rogers also released a small portion of the team’s spreadsheet for Guardian readers to download and sift through for themselves. Jacob Shapiro, a professor of politics and international affairs at Princeton, quickly responded with his concerns about the limitations of the war logs as a whole, concerns which Rogers then posted and discussed on the Datablog. Rogers said he welcomes this kind of input, and is eager to see what readers do with the data, though he acknowledges that worthwhile analysis will take a sizeable commitment of time and attention.

“I don’t really know what people will do with it, taking it on,” Rogers said. “I suspect some of the real interesting stuff will come from academics, who’ve suddenly got access to this data, and will spend a long time looking at it. I don’t think it’s the sort of thing you can just dash off.”

As for how best to present the logs to Guardian readers, Rogers first ran an experiment, just to see what it would look like: he used Google Fusion Tables and Google Maps, both free tools, to map all of the deaths reported in the Iraq war logs. That map is on the Guardian website now, and Rogers said it has gotten many more page views than he expected. But he also said that he and his colleagues were trying to be careful not to “sanitize” the information. Each dot on the map represents not only a military event, but a human life gone. Dots on a map aren’t enough.

After sifting through their data, the team at The New York Times assigned to this project (Jacob Harris, Kevin Quealy, Sabrina Tavernise, and Andrew W. Lehren) also realized that they would need a way to focus the experience for their readers. “We didn’t have many preconceptions about what the graphic was going to be,” wrote Quealy, a Times graphics editor, in an e-mail. “We started with the data—a big file with about 17,000 GPS coordinates and dates—and worked from there.”

First, they plotted every death in Baghdad by its GPS coordinates on a single map, and then divided it into six maps, one for each year from 2004 to 2009, covering more than 32,000 deaths in all. “It became clear pretty quickly that year-by-year maps would show the sharp increase in violence in Baghdad in late 2006 and early 2007, so we thought the small ‘snapshots’ would be an effective way to present that,” wrote Quealy.
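Splitting one big map into yearly snapshots amounts to bucketing each record by the year in its date. The sketch below shows that step under stated assumptions: the record layout (date, latitude, longitude), the sample values, and the function name `split_by_year` are hypothetical, and nothing here reflects the Times's actual tooling.

```python
from datetime import date

# Hypothetical records: (date, latitude, longitude). All values are invented.
records = [
    (date(2004, 5, 1), 33.31, 44.36),
    (date(2006, 11, 2), 33.30, 44.42),
    (date(2006, 12, 20), 33.34, 44.40),
    (date(2007, 1, 15), 33.35, 44.39),
]


def split_by_year(rows, first_year=2004, last_year=2009):
    """Return {year: [(lat, lon), ...]} with one entry per yearly panel."""
    panels = {year: [] for year in range(first_year, last_year + 1)}
    for day, lat, lon in rows:
        # Records dated outside the requested range are skipped.
        if day.year in panels:
            panels[day.year].append((lat, lon))
    return panels


panels = split_by_year(records)
```

Each list in `panels` could then be handed to whatever mapping tool draws a single snapshot. Years with no records still get an empty panel, so the six maps stay aligned.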

But when the big-picture maps didn’t feel like quite enough, the interactive teams at the Times and The Guardian each decided—independently of one another—to create another map, focusing on a single day of violence, though they chose different days to highlight.

The Times chose December 20, 2006 because, according to the data, it was one of the deadliest days of the conflict in Baghdad: “There were 114 separate episodes of violence that day, resulting in the deaths of about 160 Iraqi citizens and police officers.” Assistant editor of interactive news Jacob Harris said that Sabrina Tavernise, a Times national correspondent who worked on the project, had the idea to focus on a single day, “to provide a window into the horrific levels of violence without distancing the reader from the daily impact the way a yearly map does.”

Tavernise wrote in an e-mail that, even after narrowing it down, “digesting the data was fiendishly complicated.” She and Harris worked together on this part: each entry for that one day had to be hand counted and sorted by factors like grid coordinates, type of casualty, nationality of victim, and cause of death.

Especially important, of course, was comparing all of the reports to each other to make sure there were no duplicates. Harris said they had originally settled on a different day that month, which had appeared to be the deadliest. But then they found that the discovery of twenty-eight corpses had been recorded three different times, by three different military units. So where they had initially thought there were eighty-four deaths, there were actually twenty-eight.
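The arithmetic of that duplication is easy to reproduce. The sketch below is a simplified illustration, not the Times's method (Tavernise and Harris counted and compared entries by hand): it assumes duplicate reports match exactly on date, grid reference, incident type, and death count, and every field name and value in it is invented. Real duplicates filed by different units may differ slightly in coordinates or wording, which an exact-match key like this would miss.

```python
# Hypothetical reports. The same discovery of 28 bodies is filed by three
# different units; a fourth, unrelated report is included for contrast.
# The date, grid references, and unit names are invented placeholders.
reports = [
    {"unit": "Unit A", "date": "2006-12-01", "grid": "GRID-1", "type": "bodies found", "deaths": 28},
    {"unit": "Unit B", "date": "2006-12-01", "grid": "GRID-1", "type": "bodies found", "deaths": 28},
    {"unit": "Unit C", "date": "2006-12-01", "grid": "GRID-1", "type": "bodies found", "deaths": 28},
    {"unit": "Unit A", "date": "2006-12-01", "grid": "GRID-2", "type": "ied", "deaths": 2},
]


def event_key(report):
    """Fields assumed to identify one event, regardless of who reported it."""
    return (report["date"], report["grid"], report["type"], report["deaths"])


def dedupe(rows):
    """Keep the first report seen for each distinct event key."""
    first_seen = {}
    for row in rows:
        first_seen.setdefault(event_key(row), row)
    return list(first_seen.values())


body_reports = [r for r in reports if r["type"] == "bodies found"]
raw_count = sum(r["deaths"] for r in body_reports)              # three filings
deduped_count = sum(r["deaths"] for r in dedupe(body_reports))  # one event
```

Here `raw_count` comes to 84 and `deduped_count` to 28, the same gap Harris describes.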

This kind of duplication is common throughout the WikiLeaks reports, which is why Harris says they are careful to caution their readers about taking any hard numbers away from the story. While they could take the time to vet and tally a single day in December 2006, with its 114 separate violent events, they didn’t have the manpower to repeat that process for all 400,000 documents. (For an organization that’s working on doing just that, check out Iraq Body Count.) So while the Times team felt comfortable using the raw numbers to create the yearly-map snapshots, they deliberately did not post any hard numbers alongside those graphics on the site.

“We use the data to report on reality,” said Harris. “We don’t necessarily want to just report on what the data says.” In this instance, the data merely gave them an idea of the general pattern of violence in Baghdad across time, from 2004 to 2009. “It is powerful to see how the violence just swells like this,” he said. “But of course we have to be wary of where the data has potential issues.”

The Guardian data team chose October 17, 2006 for their “day in the life” of the Iraq war because it was “a typical day in one of the bloodiest years of the Iraq conflict – 136 dead Iraqis, 10 dead Americans and hundreds of violent incidents.” Guardian correspondent James Meek compiled and verified that day’s entries, and then guided readers through them in an accompanying article. Then he and graphic designers Alastair Dant and Mariana Santos put together the Flash-animation map. Santos, a freelancer hired for the project, has experience with video documentaries as well as with Flash animation, which may have influenced the style of the presentation, said Rogers.

The effect of the map timeline is anything but “sanitized.” Like the IED timeline map that The Guardian made for the Afghanistan war logs, the emotional impact for the viewer builds up over time, as the casualties accumulate on the map. This time, though, it’s just one day, and the pace of the scrolling text of the incident summaries encourages (enforces, actually) a slow read. The timeline begins at midnight, when the text rolling across the screen reads:

00:00
There is already fighting when midnight comes around. Insurgents have rocketed a US infantry base south of Baghdad and set fire to oxygen tanks. The Americans fire back and a civilian is wounded by shrapnel in the chest.

The summaries continue throughout the next twenty-four hours, each in the same simple, readable language. The map slowly fills up with colored dots, different colors for different incidents: American death; Iraqi death; arrest; kidnap; bombs or grenades; explosions; firefights; shootings. A red numeral to the right of the map keeps a tally of the deaths that day, as the text keeps rolling. In a particularly subtle touch, the background of the map changes from dark gray, to light gray, to white, to gray again, mimicking the visibility at each time of day and night.

The Times’s map is much more static and simple than The Guardian’s. This might be because the Times team is currently quite distracted by the midterm elections next week, something that Harris and Quealy acknowledged. “We’re running on incredible deadlines all the time, so we try to best present the information with the amount of time we have,” said Quealy.

Something that both teams said they had to keep in mind, besides time and staff constraints, was the danger of visualizing and “interactivizing” (Rogers’s word) too much information. “What we’re increasingly trying to do is to simplify stuff, because you can really overcomplicate all this,” said Rogers. “Just because something can move, does it need to? Does it need to slide across the screen, when you can do the same thing without sliding across the screen, and do it better? We’re wrestling with that all the time, [asking ourselves,] Are we overdoing it?”

(For further reading: Kevin Quealy from the Times described the process behind one of their most popular infographics, a map of Netflix queues by city neighborhood, in a very interesting post for The Society for News Design’s website, which you can read here.)

Lauren Kirchner is a freelance writer covering digital security for CJR. Find her on Twitter at @lkirchner