Data reveals who isn’t talking about terrorism

This week's Data Darts and Laurels

Charles Ornstein, a senior reporter for ProPublica, wrote a comprehensive story for The New York Times’ Upshot about the drug Acthar, which cost the Medicare program about $141.5 million in 2012 despite being prescribed just 3,387 times under Medicare that year. Ornstein’s story is a prime example of how to find stories inside datasets and explain them through additional reporting. (ProPublica has produced a number of stories from the Medicare prescription data it acquired via a Freedom of Information Act request.) Ornstein gets a LAUREL.

FiveThirtyEight had a good week, showing the illuminating role data can play in explaining broad issues. Two stories particularly stand out. In one, statistician Emma Pierson dives into more than 938,000 scientific papers to analyze how much credit women get in such writing. Though the piece comes off a bit wonky (it is about academia, after all), its statistical analysis clearly shows a gender gap in the science writing world. Pierson also reaches another important conclusion: that “the rise of big data has made it far easier to study gender inequality.”

In another piece on the site, Hayley Munguia analyzes how the words “terror” and “terrorism” are uttered far less often now than in 2001, both in Congress and by the sitting president. Data journalism that counts how many times a certain word is uttered has become awfully popular these days. (See this piece from The Verge about how many times a derivative of a certain four-letter word was mentioned in public comments about net neutrality.) Munguia’s piece wonderfully uses data, gathered from public transcripts, to show how an issue has faded over the course of time. This week, FiveThirtyEight gets multiple LAURELS.

The Washington Post’s Wonkblog produced a map of marijuana usage by state this week, drawn from the latest National Survey on Drug Use and Health. The map certainly is interesting (Kansas reports the lowest percentage of users while Rhode Island has the highest), but the overall treatment needs a bit more skepticism and explanation. Chances are that, no matter how anonymous the survey is and however much more Americans as a whole support legalizing marijuana than they used to, the number of people reporting themselves as users is too low. The piece accompanying the map needs some discussion of this. Furthermore, the accompanying piece mentions that Alaska and Oregon rank high in reported usage, second and fourth among states respectively, noting that both states may legalize marijuana come fall. The piece does not mention Rhode Island (first) or Vermont (third), and the high reported usage in these states goes unexplained. This lack of explanation renders the map useless, as the reader is left wondering what makes Vermont and Rhode Island so special compared with the other 48 states. This Wonkblog map gets a DART.

We end this week’s look at data journalism by bringing attention to a speech Attorney General Eric Holder gave in Philadelphia about the dangers of basing prison sentences on big data. In the speech, he said:

“By basing sentencing decisions on static factors and immutable characteristics—like the defendant’s education level, socioeconomic background, or neighborhood—they may exacerbate unwarranted and unjust disparities that are already far too common in our criminal justice system and in our society.”

Data journalists should seize this opportunity by exploring how big data is currently used in sentencing and how outcomes might change if it were used uniformly nationwide.

Tanveer Ali is a Chicago-based journalist who is Chicago's data reporter and social media producer. He has reported for the Chicago News Cooperative, WBEZ, and GOOD Magazine, among others. A former staff writer at the Detroit News, he received a master's in journalism from the Medill School of Journalism.