When an election season ramps up, newsrooms turn to polls to get a better understanding of both the state of the race and Americans’ attitudes about what’s on the ballot. These polls give journalists a sense of whether an upstart progressive has a shot at being the next mayor of New York or whether involving the US military in strikes against Iran might be unpopular among President Trump’s supporters.
But turning polls into meaningful insight isn’t straightforward. For instance, while one recent poll shows a progressive candidate within arm’s reach of winning the Democratic primary for mayor in New York City, an overview of all the polls about the race gives a more complete picture of just how close he may be. Aggregating polling data and interpreting it, however, is a laborious process that requires a specialized skill set: fluency in statistics, an understanding of the political polling landscape, and a daily commitment to finding, reading, and standardizing new data. Highly skilled researchers in newsrooms that aggregate polls often spend hours on rote tasks just to keep the data pipelines running.
That’s why, with support from the Brown Institute’s Magic Grant, researchers at the Tow Center for Digital Journalism are developing Pollfinder.ai, a tool that uses large language models (LLMs) to help polling aggregators discover, extract, and organize polling data more efficiently.
What is polling aggregation, and why is it so hard?
Aggregation is hard in large part because of sheer volume. As Mary Radcliffe, a former senior researcher at FiveThirtyEight, puts it, “It’s completely infeasible…to code and catalogue every single issue question that pollsters ask. That would take more time than I think we have.”

Polling data also comes in all kinds of unstructured formats (e.g., press releases, images, PDFs), making data collection labor-intensive and time-consuming, especially during election cycles, when the volume of new polls can triple. “Our researchers are frequently having to sift through hundreds of news articles,” Radcliffe explains. “You’ll have ten news articles about the same poll with different headlines. You end up having to click on [the same poll] ten times over.” This tedious process overwhelms aggregation teams and leaves them little time to systematically capture the richer, and arguably more important, issue-based polling.
This is where we think LLMs can help.
Our approach
We are collaborating with Radcliffe to test how well LLMs can assist with these tasks. We have started developing two main components:
- The first is Poll Detector, which aims to assist with the time-consuming task of manually searching for new polls by using LLMs to scan articles and identify whether they contain polling data. If an article mentions a poll, the model extracts basic information about it (e.g., pollster name, sponsor name, date the poll was conducted, sample size) and determines whether the poll is new or already logged in the database.
- The second is Question Indexer, which will extract and index the text of the questions asked in each poll, allowing us to build a text-searchable database of the issue questions pollsters are asking Americans. (A sketch of the kind of structured record both tools might produce follows this list.)

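To make the two components more concrete, here is a minimal sketch, in Python, of the kind of structured record they might produce for each poll. The field names are illustrative assumptions, not Pollfinder.ai’s actual schema.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class PollRecord:
    """Illustrative record for one poll surfaced by Poll Detector."""
    pollster: str                      # e.g., "Quinnipiac University"
    sponsor: Optional[str] = None      # outlet or group that commissioned the poll, if any
    start_date: Optional[str] = None   # first day of fieldwork, ISO format
    end_date: Optional[str] = None     # last day of fieldwork
    sample_size: Optional[int] = None
    source_url: Optional[str] = None   # article or press release where the poll was found
    already_logged: bool = False       # True if the poll is already in the aggregator's database
    questions: list[str] = field(default_factory=list)  # verbatim question text, for Question Indexer
```

In this sketch, Poll Detector would fill in the identifying fields and the already_logged flag, while Question Indexer would populate the questions list.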
But before releasing a product, we need to answer two research questions: How reliably can LLMs find polls? And how reliably can they read them?
Can LLMs find polls?
To evaluate whether LLMs can assist in discovering polls, we set up Google Alerts for keywords such as “poll AND (approve AND Trump),” which delivers a noisy feed of new articles that match these keywords every thirty minutes via RSS. Our Poll Detector then scans each article, extracts any polling data, and flags potential new polls for Mary to review.
(Whether you’re collecting polls, data about police shootings, or something else, this Google Alerts–powered newsroom data-collection workflow may sound familiar. If it does, and you’d like to experiment with integrating our infrastructure into it, please contact us!)
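For readers who want to see the plumbing, the sketch below shows one way a Google Alerts–to–LLM pipeline like this can be wired together. It is a simplified illustration, not Pollfinder.ai’s production code: the feed URL, the prompt, and the call_llm function are placeholders for whatever alert feed and model provider a newsroom actually uses.

```python
# Simplified sketch of a Google Alerts -> LLM poll-detection loop.
# The feed URL, prompt, and call_llm() are placeholders, not Pollfinder.ai's actual code.
import json

import feedparser  # pip install feedparser
import requests

ALERT_FEED_URL = "https://www.google.com/alerts/feeds/EXAMPLE_FEED_ID"  # hypothetical alert feed

PROMPT = """You are helping a polling aggregator.
Does the article below report results of an opinion poll?
Respond only with JSON:
{{"has_poll": true or false, "pollster": "...", "sponsor": "...",
  "start_date": "...", "end_date": "...", "sample_size": 0}}
Use null for anything the article does not state.

ARTICLE:
{article_text}
"""


def scan_feed(call_llm, seen_polls: set) -> list:
    """Scan new alert items, ask the model for poll metadata, and flag unseen polls.

    `call_llm` is whatever function sends a prompt to your model provider and
    returns its raw text response.
    """
    flagged = []
    for entry in feedparser.parse(ALERT_FEED_URL).entries:
        article_text = requests.get(entry.link, timeout=30).text  # real code would strip HTML first
        data = json.loads(call_llm(PROMPT.format(article_text=article_text[:20000])))
        if not data.get("has_poll"):
            continue
        key = (data.get("pollster"), data.get("end_date"))  # crude de-duplication key
        if key in seen_polls:
            continue  # ten articles about the same poll collapse into one entry
        seen_polls.add(key)
        data["source_url"] = entry.link
        flagged.append(data)  # flagged polls go to a human researcher for verification
    return flagged
```

A de-duplication key as crude as pollster plus end date will make mistakes, which is one more reason flagged polls go to a human reviewer rather than straight into the database.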

Over the next few months, Mary will help check the accuracy of this data and update her database with the polls she discovers using Pollfinder.ai.
For our initial tests, we are focusing on approval-rating polls for Donald Trump and JD Vance. Approval polls are a common type of poll to aggregate in non-election years because the questions are frequently asked in consistent ways by different pollsters, making it easier to average and create trend lines over time.
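To illustrate what “average and create trend lines” means in the simplest terms, here is a toy example in Python with invented numbers; real aggregators weight polls by sample size, recency, and pollster quality rather than taking a plain rolling mean.

```python
# Toy illustration of turning individual approval polls into a trend line.
# The numbers are invented; real aggregators weight by sample size, recency,
# and pollster quality instead of taking a plain rolling mean.
import pandas as pd

polls = pd.DataFrame(
    {
        "end_date": pd.to_datetime(["2025-03-02", "2025-03-05", "2025-03-09", "2025-03-14"]),
        "approve": [44.0, 46.0, 43.0, 45.0],  # percent approving in each poll
    }
).set_index("end_date")

# A 14-day rolling average over poll end dates produces a smoothed trend line.
trend = polls["approve"].rolling("14D").mean()
print(trend)
```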
We have been tracking polls using this approach since March 2025. While Pollfinder.ai isn’t publicly available yet, we have some early wins: Mary tells us that she has found polls using our system that she would have otherwise missed.

We are currently working with Mary to reduce the noise in the feed, improve the accuracy of its data extraction, and build a clean user interface.
Can LLMs read polls?
We also conducted a second experiment to evaluate the models’ ability to extract more detailed polling metadata. For this study, which we conducted in collaboration with political analyst Mark Blumenthal and presented at the eightieth annual conference of the American Association for Public Opinion Research (AAPOR), we used a 2024 transparency scoring dataset compiled by FiveThirtyEight and Blumenthal. The dataset includes detailed assessments of how polling organizations adhere to AAPOR transparency guidelines and serves as a benchmark for evaluating model accuracy.
Using a filtered sample of 156 polls from the dataset, we provided an LLM with original poll documents and tasked it with extracting both the relevant data points and the supporting source sentences for each transparency metric. The initial performance was promising across many fields and, we found, could be further improved with some additional prompt engineering.
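To give a flavor of how that extraction task can be framed, here is a minimal sketch; the field list, prompt wording, and string-matching check are simplified illustrations, not the exact setup used in the study.

```python
# Illustrative sketch of extracting transparency metadata along with the
# sentence that supports each value. The field list and prompt are simplified,
# not the exact setup used in the AAPOR study.

TRANSPARENCY_FIELDS = [
    "sponsor",          # who paid for the poll
    "field_dates",      # when interviews were conducted
    "sample_size",
    "population",       # e.g., adults, registered voters, likely voters
    "mode",             # e.g., live phone, online panel, text
    "margin_of_error",
]


def build_prompt(poll_document: str) -> str:
    """Ask the model for each field's value plus the verbatim sentence supporting it."""
    return (
        "For each field below, report its value as stated in the poll document, "
        "and quote the exact sentence that supports it. If the document does not "
        "disclose a field, return null for both.\n"
        f"Fields: {', '.join(TRANSPARENCY_FIELDS)}\n"
        'Respond as JSON: {"field": {"value": ..., "source_sentence": ...}, ...}\n\n'
        f"POLL DOCUMENT:\n{poll_document}"
    )


def keep_verifiable(extraction: dict, poll_document: str) -> dict:
    """Drop fields whose quoted sentence does not appear verbatim in the document,
    so a reviewer only sees claims they can check against the source."""
    return {
        name: item
        for name, item in extraction.items()
        if item.get("source_sentence") and item["source_sentence"] in poll_document
    }
```

Requiring a supporting sentence for every value gives a reviewer something concrete to check against the original document, which is what makes the manual verification described below practical.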

Initial results from our two experiments suggest that although the outputs will contain errors, LLMs can take a first pass at surfacing relevant polls for polling aggregation and pre-filling structured metadata to assist with transparency scoring. For editors and researchers with domain expertise, this can significantly reduce the time spent on discovery and data entry. But manual verification is crucial, especially because models can misinterpret ambiguous formats or hallucinate details.
Pollfinder is not a fully automated solution, nor is it ever meant to be. As previous Tow Center research has revealed, LLMs still suffer from issues like the failure to accurately cite news content. We believe the process of polling aggregation will require human oversight and judgment for the foreseeable future. But an LLM-powered first pass can save researchers like Mary hours each day, allowing them to focus their highly specialized skills on more nuanced tasks: deciding how best to aggregate issue polling, vetting pollsters to verify their legitimacy, and conducting analysis that will further improve the quality of news.
As newsrooms begin to integrate AI-powered solutions into their workflows, they face a choice: use this technology as an opportunity to better leverage human expertise in ways that improve news, or use it to replace humans and let their news products suffer as a result. We hope organizations using Pollfinder, for instance, will see the clear benefits of retaining talented researchers like Mary and her team and redirecting the time saved toward more ambitious projects, like aggregating issue polling, rather than using automation as a justification for cutting staff.
Ultimately, such tools can be used to improve journalism and improve the quality of public discourse—or to undercut it. We hope newsrooms will pick the former.