When an election season ramps up, newsrooms turn to polls to get a better understanding of both the state of the race and Americans’ attitudes about what’s on the ballot. These polls give journalists a sense of whether an upstart progressive has a shot at being the next mayor of New York or whether involving the US military in strikes against Iran might be unpopular among President Trump’s supporters.
But turning polls into meaningful insight isn’t straightforward. For instance, while one recent poll shows a progressive candidate within arm’s reach of winning the Democratic primary for mayor in New York City, an overview of all the polls about the race gives a more complete picture of just how close he may be. Aggregating polling data and interpreting it, however, is a laborious process that requires a specialized skill set: fluency in statistics, an understanding of the political polling landscape, and a daily commitment to finding, reading, and standardizing new data. Highly skilled researchers in newsrooms that aggregate polls often spend hours on rote tasks just to keep the data pipelines running.
That’s why, with support from the Brown Institute’s Magic Grant, researchers at the Tow Center for Digital Journalism are developing Pollfinder.ai, a tool that uses large language models (LLMs) to help polling aggregators discover, extract, and organize polling data more efficiently.
What is polling aggregation, and why is it so hard?
Aggregation is hard in large part because of sheer volume. As Mary Radcliffe, a former senior researcher at FiveThirtyEight, puts it, “It’s completely infeasible…to code and catalogue every single issue question that pollsters ask. That would take more time than I think we have.”

Polling data also comes in all kinds of unstructured formats (e.g., press releases, images, PDFs), making data collection labor-intensive and time-consuming, especially during election cycles, when the volume of new polls can triple. “Our researchers are frequently having to sift through hundreds of news articles,” Radcliffe explains. “You’ll have ten news articles about the same poll with different headlines. You end up having to click on [the same poll] ten times over.” This tedious process overwhelms aggregation teams and leaves them little time to systematically capture the richer, and arguably more important, issue-based polling.
This is where we think LLMs can help.
Our approach
We are collaborating with Radcliffe to test how well LLMs can assist with these tasks. We have started developing two main components:
- The first is Poll Detector, which aims to assist with the time-consuming task of manually searching for new polls by using LLMs to scan articles and identify whether they contain polling data. If an article mentions a poll, the model extracts basic information about it (e.g., pollster name, sponsor name, date the poll was conducted, sample size) and determines whether the poll is new or already logged in the database.
- The second is Question Indexer, which will extract and index the text of the questions asked in each poll, allowing us to build a text-searchable database of the issue questions pollsters are asking Americans. (A sketch of the kind of structured record both tools might produce follows this list.)

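To make the two components more concrete, here is a minimal sketch, in Python, of the kind of structured record they might produce for each poll. The field names are illustrative assumptions, not Pollfinder.ai’s actual schema.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class PollRecord:
    """Illustrative record for one poll surfaced by Poll Detector."""
    pollster: str                      # e.g., "Quinnipiac University"
    sponsor: Optional[str] = None      # outlet or group that commissioned the poll, if any
    start_date: Optional[str] = None   # first day of fieldwork, ISO format
    end_date: Optional[str] = None     # last day of fieldwork
    sample_size: Optional[int] = None
    source_url: Optional[str] = None   # article or press release where the poll was found
    already_logged: bool = False       # True if the poll is already in the aggregator's database
    questions: list[str] = field(default_factory=list)  # verbatim question text, for Question Indexer
```

In this sketch, Poll Detector would fill in the identifying fields and the already_logged flag, while Question Indexer would populate the questions list.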
But before releasing a product, we need to answer two research questions: How reliably can LLMs find polls? And how reliably can they read them?
Can LLMs find polls?
To evaluate whether LLMs can assist in discovering polls, we set up Google Alerts for keywords such as “poll AND (approve AND Trump),” which delivers a noisy feed of new articles that match these keywords every thirty minutes via RSS. Our Poll Detector then scans each article, extracts any polling data, and flags potential new polls for Mary to review.
(Whether you’re collecting polls, data about police shootings, or something else, this Google Alerts–powered newsroom data-collection workflow may sound familiar. If it does, and you’d like to experiment with integrating our infrastructure into it, please contact us!)
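For readers who want to see the plumbing, the sketch below shows one way a Google Alerts–to–LLM pipeline like this can be wired together. It is a simplified illustration, not Pollfinder.ai’s production code: the feed URL, the prompt, and the call_llm function are placeholders for whatever alert feed and model provider a newsroom actually uses.

```python
# Simplified sketch of a Google Alerts -> LLM poll-detection loop.
# The feed URL, prompt, and call_llm() are placeholders, not Pollfinder.ai's actual code.
import json

import feedparser  # pip install feedparser
import requests

ALERT_FEED_URL = "https://www.google.com/alerts/feeds/EXAMPLE_FEED_ID"  # hypothetical alert feed

PROMPT = """You are helping a polling aggregator.
Does the article below report results of an opinion poll?
Respond only with JSON:
{{"has_poll": true or false, "pollster": "...", "sponsor": "...",
  "start_date": "...", "end_date": "...", "sample_size": 0}}
Use null for anything the article does not state.

ARTICLE:
{article_text}
"""


def scan_feed(call_llm, seen_polls: set) -> list:
    """Scan new alert items, ask the model for poll metadata, and flag unseen polls.

    `call_llm` is whatever function sends a prompt to your model provider and
    returns its raw text response.
    """
    flagged = []
    for entry in feedparser.parse(ALERT_FEED_URL).entries:
        article_text = requests.get(entry.link, timeout=30).text  # real code would strip HTML first
        data = json.loads(call_llm(PROMPT.format(article_text=article_text[:20000])))
        if not data.get("has_poll"):
            continue
        key = (data.get("pollster"), data.get("end_date"))  # crude de-duplication key
        if key in seen_polls:
            continue  # ten articles about the same poll collapse into one entry
        seen_polls.add(key)
        data["source_url"] = entry.link
        flagged.append(data)  # flagged polls go to a human researcher for verification
    return flagged
```

A de-duplication key as crude as pollster plus end date will make mistakes, which is one more reason flagged polls go to a human reviewer rather than straight into the database.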

Over the next few months, Mary will help check the accuracy of this data and update her database with the polls she discovers using Pollfinder.ai.
For our initial tests, we are focusing on approval-rating polls for Donald Trump and JD Vance. Approval polls are a common type of poll to aggregate in non-election years because the questions are frequently asked in consistent ways by different pollsters, making it easier to average and create trend lines over time.
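To illustrate what “average and create trend lines” means in the simplest terms, here is a toy example in Python with invented numbers; real aggregators weight polls by sample size, recency, and pollster quality rather than taking a plain rolling mean.

```python
# Toy illustration of turning individual approval polls into a trend line.
# The numbers are invented; real aggregators weight by sample size, recency,
# and pollster quality instead of taking a plain rolling mean.
import pandas as pd

polls = pd.DataFrame(
    {
        "end_date": pd.to_datetime(["2025-03-02", "2025-03-05", "2025-03-09", "2025-03-14"]),
        "approve": [44.0, 46.0, 43.0, 45.0],  # percent approving in each poll
    }
).set_index("end_date")

# A 14-day rolling average over poll end dates produces a smoothed trend line.
trend = polls["approve"].rolling("14D").mean()
print(trend)
```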
We have been tracking polls using this approach since March 2025. While Pollfinder.ai isn’t publicly available yet, we have some early wins: Mary tells us that she has found polls using our system that she would have otherwise missed.

We are currently working with Mary to reduce the noise in the feed, improve the accuracy of its data extraction, and build a clean user interface.
Can LLMs read polls?
We also conducted a second experiment to evaluate the models’ ability to extract more detailed polling metadata. For this study, which we conducted in collaboration with political analyst Mark Blumenthal and presented at the eightieth annual conference of the American Association for Public Opinion Research (AAPOR), we used a 2024 transparency scoring dataset compiled by FiveThirtyEight and Blumenthal. The dataset includes detailed assessments of how polling organizations adhere to AAPOR transparency guidelines and serves as a benchmark for evaluating model accuracy.
Using a filtered sample of 156 polls from the dataset, we provided an LLM with original poll documents and tasked it with extracting both the relevant data points and the supporting source sentences for each transparency metric. The initial performance was promising across many fields and, we found, could be further improved with some additional prompt engineering.
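To give a flavor of how that extraction task can be framed, here is a minimal sketch; the field list, prompt wording, and string-matching check are simplified illustrations, not the exact setup used in the study.

```python
# Illustrative sketch of extracting transparency metadata along with the
# sentence that supports each value. The field list and prompt are simplified,
# not the exact setup used in the AAPOR study.

TRANSPARENCY_FIELDS = [
    "sponsor",          # who paid for the poll
    "field_dates",      # when interviews were conducted
    "sample_size",
    "population",       # e.g., adults, registered voters, likely voters
    "mode",             # e.g., live phone, online panel, text
    "margin_of_error",
]


def build_prompt(poll_document: str) -> str:
    """Ask the model for each field's value plus the verbatim sentence supporting it."""
    return (
        "For each field below, report its value as stated in the poll document, "
        "and quote the exact sentence that supports it. If the document does not "
        "disclose a field, return null for both.\n"
        f"Fields: {', '.join(TRANSPARENCY_FIELDS)}\n"
        'Respond as JSON: {"field": {"value": ..., "source_sentence": ...}, ...}\n\n'
        f"POLL DOCUMENT:\n{poll_document}"
    )


def keep_verifiable(extraction: dict, poll_document: str) -> dict:
    """Drop fields whose quoted sentence does not appear verbatim in the document,
    so a reviewer only sees claims they can check against the source."""
    return {
        name: item
        for name, item in extraction.items()
        if item.get("source_sentence") and item["source_sentence"] in poll_document
    }
```

Requiring a supporting sentence for every value gives a reviewer something concrete to check against the original document, which is what makes the manual verification described below practical.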

Initial results from our two experiments suggest that although the outputs will contain errors, LLMs can take a first pass at surfacing relevant polls for polling aggregation and pre-filling structured metadata to assist with transparency scoring. For editors and researchers with domain expertise, this can significantly reduce the time spent on discovery and data entry. But manual verification is crucial, especially because models can misinterpret ambiguous formats or hallucinate details.
Pollfinder is not a fully automated solution, nor is it ever meant to be. As previous Tow Center research has revealed, LLMs still suffer from issues like the failure to accurately cite news content. We believe the process of polling aggregation will require human oversight and judgment for the foreseeable future. But an LLM-powered first pass can save researchers like Mary hours each day, allowing them to focus their highly specialized skills on more nuanced tasks: deciding how best to aggregate issue polling, vetting pollsters to verify their legitimacy, and conducting analysis that will further improve the quality of news.
As newsrooms begin to integrate AI-powered solutions into their workflows, they face a choice: use this technology as an opportunity to better leverage human expertise in ways that improve news, or use it to replace humans and let their news products suffer as a result. We hope organizations using Pollfinder, for instance, will see the clear benefits of retaining talented researchers like Mary and her team and redirecting the time saved toward more ambitious projects, like aggregating issue polling, rather than using automation as a justification for cutting staff.
Ultimately, such tools can be used to improve journalism and improve the quality of public discourse—or to undercut it. We hope newsrooms will pick the former.