Independent journalism requires skepticism as a mandatory practice. No matter how truthfully a presidential administration conducts itself, reporters should always question, contrast, and complement official information to find the closest version to the truth.
But this task is proving to be particularly difficult in the Trump era. Journalists are facing the challenge of covering one of the most unusual and unreliable governments in modern history: President Trump disseminates lies, twisted facts, and changes in policy in real time through his Twitter account. His advisors send contradictory messages on sensitive national topics and change policies at the last minute, surprising even Cabinet members. Federal data vanishes from the “thin cloud” on matters such as climate change and the environment.
Despite all of this, or perhaps because of it, investigative journalism is flourishing as it did during the Watergate days. This time, however, journalists are much better equipped to find the truth independently, thanks to data and technology. The challenge for journalists is to harness the power of technology thoroughly and selectively while upholding the profession’s core mission. To that end, the Columbia Journalism School is launching a Master of Science in Data Journalism that we hope will advance data journalism education and contribute to building the next generation of newsroom leaders.
So far, most of the explosive revelations about Trump’s persona, his campaign, his business deals, and his personal relationships have come from sources or whistleblowers who approached journalists to share information, or vice versa. Sources are invaluable to reporters and, in many cases, are the only means to the story. But they always have an agenda. They generally share only information that serves the interests of the organizations they represent. A journalist who depends exclusively on source leaks might be missing the full story.
Data can help a reporter fill in the missing pieces, and it also allows journalists to expose a universe beyond the one sources are willing to share. USA Today, for instance, recently published an analysis of 4,095 lawsuits involving Trump over the last three decades. These range from skirmishes with casino patrons, to million-dollar real estate lawsuits, to personal defamation lawsuits. The data not only exposes how litigious Trump is; it also gives us insight into why he ended up in court, as a plaintiff or as a defendant, and into the trends and results of those legal cases.
Data for journalism goes beyond spreadsheets and charts. In today’s world, data can be extracted from pictures, videos, audio, and books. Using text mining, journalists can dig into Trump’s mind by analyzing the more than two dozen books he has published since 1987, when he launched The Art of the Deal. While it’s true you don’t need a sophisticated algorithm to figure out that his favorite writing topics are “winning” and “getting rich,” text mining and machine learning can help us build a better understanding of his personal values, anecdotes, and timelines.
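As a rough illustration of the simplest form of text mining, the sketch below counts the most frequent terms in a passage. It is only a starting point under stated assumptions: the function name `top_terms`, the tiny stopword list, and the sample sentence are all invented for this example, and the sample is not a quotation from any book.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real projects use a fuller one).
STOPWORDS = {"the", "a", "an", "and", "of", "to", "is", "in", "i", "it", "that", "about"}

def top_terms(text, n=5):
    """Return the n most frequent non-stopword terms in text as (term, count) pairs."""
    # Lowercase the text and pull out runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return counts.most_common(n)

# Placeholder text invented for this sketch, not a quotation from any book.
sample = "Winning is about deals. Deals are about winning, and winning again."
print(top_terms(sample, 2))
```

Run over the full text of a book, the same counting approach could surface recurring themes; more ambitious analyses (topics, sentiment, timelines) would need dedicated libraries and careful validation.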
With data mining, one can collect and analyze the personal and business relationships of the current president, as the free database LittleSis has been doing over the last few months. Applying network analysis tools to those connections, journalists could find clusters, intersections between key players and, ultimately, the circle of people he trusts the most.
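The idea of finding clusters and well-connected players can be sketched with a small graph built from relationship records. Everything here is hypothetical: the names are placeholders rather than real LittleSis data, and `build_graph` and `clusters` are helper names made up for this example.

```python
from collections import defaultdict

# Hypothetical relationship records (placeholders, not real LittleSis data).
edges = [
    ("Person A", "Company X"),
    ("Person B", "Company X"),
    ("Person A", "Person C"),
    ("Person D", "Company Y"),
]

def build_graph(edges):
    """Build an undirected graph as a mapping from node to its set of neighbors."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph

def clusters(graph):
    """Group nodes into connected components using a depth-first search."""
    seen, result = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        result.append(component)
    return result

graph = build_graph(edges)
# Degree = number of direct connections; a crude measure of how central a node is.
degree = {node: len(neighbors) for node, neighbors in graph.items()}
print(clusters(graph))
print(sorted(degree.items(), key=lambda item: -item[1]))
```

Connected components and degree counts are among the simplest network measures; a real investigation would likely use a dedicated graph library and treat any pattern as a lead to report out, not a finding.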
Local journalism could especially benefit from designing data tools to automatically trace and detect local policy changes in education, health, immigration, politics, and the economy. Such tools can also monitor public spending and procurement contracts and cross-reference the names of awardees with a local politician’s close circle of supporters and financiers.
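Cross-referencing awardees against a list of supporters could, in its most basic form, look like the sketch below. The records, the field names `awardee` and `amount`, and the helpers `normalize` and `flag_matches` are all assumptions made up for illustration; real datasets would need far more careful name matching.

```python
def normalize(name):
    """Lowercase a name and strip commas, periods, and extra spaces for comparison."""
    cleaned = name.lower().replace(",", " ").replace(".", " ")
    return " ".join(cleaned.split())

# Hypothetical records, invented for this sketch.
contracts = [
    {"awardee": "ACME Paving, Inc.", "amount": 120000},
    {"awardee": "Riverside Supplies", "amount": 45000},
]
donors = ["Acme Paving Inc", "Jane Doe"]

def flag_matches(contracts, donors):
    """Return the contracts whose awardee name also appears in the donor list."""
    donor_keys = {normalize(d) for d in donors}
    return [c for c in contracts if normalize(c["awardee"]) in donor_keys]

for contract in flag_matches(contracts, donors):
    print(contract["awardee"], contract["amount"])
```

A name match of this kind is only a lead: two entities can share a name, and the same entity can appear under different spellings, so every flagged record would still need to be verified through reporting.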
Data tools applied to journalism offer a double benefit: they can be designed to automatically collect and process data in real time, and they can also serve as a vast knowledge source for analyzing and understanding trends, patterns, and outliers over time.
Additionally, data brings transparency to the journalistic practice. When journalists share their data with the public, they put on display the seams of their work (their priorities and their decision-making process), enabling citizens to evaluate their bias, or lack thereof, and keeping journalism accountable.
During the past three decades, public records have been continuously digitized. Technology is now available to analyze, process, and distill the knowledge buried in those records. Society deemed the preservation of these records to be in the public interest; the knowledge that can be derived from them depends on us, data journalists.