How OkCupid is bolstering data journalism

Digital companies discover patterns in usage

In the introduction of Dataclysm: Who We Are (When We Think No One is Looking), published last month, OkCupid co-founder Christian Rudder notes that, while companies started compiling data for profit, the resulting troves contain vast storytelling potential:

Twitter, Reddit, Tumblr, Instagram, all companies are businesses first, but, as a close second, they’re demographers of unprecedented reach, thoroughness, and importance. Practically as an accident, digital data can now show us how we fight, how we love, how we age, who we are, and how we’re changing.

Rudder stops short of saying his book and the related OkTrends blog are journalism, but some of his actions (sifting through information for conclusions and narratives) seem to fit the bill. Companies that aren’t journalism outlets, like Twitter, Facebook, and Nike, are increasingly finding stories in their data. While these stories are no doubt produced to show why the company’s core products matter, much like a press release, the best company-driven stories still illuminate something previously inaccessible. Corporate self-interest and publicity may motivate this type of storytelling, but like pure journalism, the stories can still say a lot about the world.

Rudder’s slicing of OkCupid data (which shows how people act online when, unlike on Facebook, nobody is watching) comes awfully close to being journalism. He gave a taste of his book in a July blog post explaining how OkCupid experimented on users by altering match percentages and removing photos for a while, a follow-up to the outrage over A/B testing that Facebook had revealed. Rudder also writes about how OkCupid data shows that black users have a harder time than users of other races attracting interest on the site, and he examines how perceived attractiveness may affect the amount of attention a person’s profile receives. In the past, these things might have been conveyed simply through anecdotes, but Rudder is using his company’s data to convert some assumptions into facts.

While OkCupid keeps its data analysis in-house, other companies have opened up theirs. It has become a reflex to turn to Twitter data to learn how users experience television shows, sports, and big events. Twitter has its own data visualization unit, but it has realized that many of the best stories told with Twitter data come from other media outlets, which it publicizes through its house account @twitterdata. Some of those stories are told through Twitter Reverb, a handy visualization tool the company created and allowed some media companies to access.

Not every company has used its data to tell compelling stories. Jawbone published—and media companies picked up—a piece with the obvious reveal that proximity to an earthquake disturbs sleep. But in early September, the company opened its health tracker data for developers to build upon. Third-party developers may be able to develop apps to make money off the data, but this access may be a boon to storytellers as well.

All these companies are revenue generators first, and conveying wider knowledge with their vast data is secondary. But as data accumulates that helps explain how and why things or people are the way they are, it’ll be interesting to see how these companies open their customer data to the public. If these stories help unveil a new understanding of the world, that could be just as valuable as good journalism.

Tanveer Ali is a Chicago-based journalist who works as a data reporter and social media producer. He has reported for the Chicago News Cooperative, WBEZ, and GOOD Magazine, among others. A former staff writer at the Detroit News, he received a master's in journalism from the Medill School of Journalism.