
Can AI Tools Meet Journalistic Standards?

So far, the results are spotty.


Tech companies promise that AI tools can do more with less—so perhaps they can help news outlets survive declining subscription sales and evaporating advertising revenue. Certainly, AI is being used effectively by some journalists to crunch numbers at lightning speed and make sense of vast databases. That’s a big benefit, one that has contributed to prizewinning work in the public interest. 

But more than two years after the public release of large language models (LLMs), the promise that the media industry might benefit from AI seems unlikely to be borne out, at least not fully.

Generative AI tools rely on media companies to feed them accurate and up-to-date information. At the same time, AI products are developing into something akin to a newsroom competitor, and a particularly problematic one at that: well-funded, high-volume, and at times unscrupulous.

We decided to survey cases of AI-produced text in the news industry with an eye on ethics. Can AI tools meet the standards of traditional reporting and publishing? Our research finds several recent instances in which AI tools failed to rise to the occasion. 

One of the primary problems with AI-generated text is that none of the most common AI models—including OpenAI’s ChatGPT, Google’s Gemini, Microsoft’s Copilot, and Meta’s Meta AI—can reliably and accurately cite and quote their sources. These tools commonly “hallucinate” authors and titles, or they quote real authors and books while inventing the content of the quotes. The software also fails to cite completely, at times copying text from published sources without attribution. This leaves news organizations open to accusations of plagiarism.

Last year, Forbes called out the AI tool Perplexity for ingesting its article on former Google CEO Eric Schmidt and turning the story into an AI-generated article, podcast, and video without any attribution to the outlet. On YouTube, depressingly, the Perplexity video outranked Forbes’s original story. When confronted by Forbes’s John Paczkowski on X, Perplexity CEO Aravind Srinivas blamed the incident on “rough edges” in the tool.

In a Wired article titled “Perplexity Plagiarized Our Story About How Perplexity Is a Bullshit Machine,” also published last year, Tim Marchman and Dhruv Mehrotra described how they prompted Perplexity to summarize a recent story they’d published. In its response, Perplexity reproduced one of their sentences word for word, as if it had generated the text itself—a move that appeared to them to be plagiarism. The legal experts Marchman and Mehrotra spoke with were split on whether the lifted sentence would qualify as willful copyright infringement, but there were other problems: Perplexity’s crawlers had seemingly circumvented the blocks Wired had put in place to prevent the use of its content.


Whether this type of generative AI production legally constitutes plagiarism and copyright infringement—and therefore whether media outlets should be paid for the ingestion of their work by generative AI tools—will likely be determined by several pending lawsuits.

The New York Times, the Center for Investigative Reporting (which oversees Mother Jones and Reveal), The Intercept, and eight media outlets owned by Alden Global Capital have filed lawsuits accusing OpenAI and Microsoft of violating copyright laws by ingesting their content. In the Times’ suit, filed in the Southern District of New York in December 2023, the outlet accuses OpenAI of trying “to free-ride on The Times’s massive investment in its journalism by using it to build substitutive products without permission or payment.”

The Times’ filing includes several pages of examples in which OpenAI’s ChatGPT copied text from its archives and reproduced this text verbatim for users. 

One possible—and worrying—outcome of all this is that generative AI tools will put news outlets out of business, ironically diminishing the supply of content available for AI tools to train on. 

Some media companies, including The Atlantic, Vox Media, FT Group, the Associated Press, and News Corp, have made deals with AI companies to license portions of their content. It’s worth noting that in May 2025, the Times signed an AI licensing deal with Amazon, which allows the tech company to use the outlet’s content across its platforms.

Reporters and editors eye these deals warily. A week before The Atlantic announced its deal with OpenAI, Jessica Lessin, CEO of tech-journalism outlet The Information, warned against what she saw as a Faustian bargain with AI companies.

The tech companies “attempt to take all of the audience (and trust) that great journalism attracts, without ever having to do the complicated and expensive work of the journalism itself. And it never, ever works as planned,” Lessin wrote.

Generative AI results are only as good as the materials on which the systems are trained. Without a reliable way to distinguish between high- and low-quality input, the output is often compromised. In May 2024, Google unveiled AI Overview, a feature that places AI-generated summaries above traditional search results. But it quickly proved flawed. The AI—seemingly regurgitating a twelve-year-old Reddit thread—produced a pizza recipe that included one-eighth of a cup of Elmer’s glue. In response to another query asking how many rocks someone should eat daily, AI Overview recommended “at least one small rock per day,” apparently sourcing its information from an Onion article.

Mistakes like these make multibillion-dollar tools seem incapable of even basic common sense.

AI tools also contain biases that are not so easily visible. Emilio Ferrara, a professor at the University of Southern California and a research team leader at the USC Information Sciences Institute, has found that bias can enter at several stages: in the data used to train generative AI, in its learning processes, within a tool’s infrastructure, and during deployment. Some of these biases are implicit, expressed in the selection of training texts that carry existing cultural biases, in the types of data collected, and through confirmation bias—the way an AI may be trained to yield particular results. More explicitly, a model may also produce stereotypes.

Generative AI models “may inadvertently learn and perpetuate biases present in their training data, even if the data has been filtered and cleaned to the extent possible,” Ferrara found. Ultimately, these LLMs reflect the biases of the people who program their algorithms and the internet’s complex ecosystem of users and creators—as well as, sometimes, the limited availability of content on a particular topic or authored by a particular group or in a particular language.

Bias is perhaps most vividly illustrated by image-generating AI tools, which have consistently produced troubling results for nonwhite and non-male subjects. Attempts to correct these biases have so far been clunky at best, as when Google’s Gemini, asked to produce an illustration of a 1943 German soldier, generated drawings of Asian and Black Nazis, or when prompts to illustrate the Founding Fathers resulted in images of people of multiple ethnic backgrounds.

Sometimes, the bias feels almost satirical. The automated-news company Hoodline runs a group of hyperlocal websites based primarily on AI-generated local news feeds and uses AI to generate location-specific reporter personas. Putting the dystopian nightmare of the business model aside, the names the AI generated for its personas reflected stereotypes about the communities they were intended to represent. Boston “reporters” had stereotypically Irish names like “Will O’Brien” and “Sam Cavanaugh,” while San Francisco’s AI-generated staffers were given names reflecting the city’s diversity, among them Leticia Ruiz, Tony Ng, and Erik Tanaka, Nieman Lab reported.

And then there are the user-side biases: primarily, a lack of understanding of AI’s limitations. 

The apparent convenience of putting large language models in the hands of consumers who will rely solely on that information is “pretty worrisome,” said Mark Lavallee, the Knight Center’s director of technology, product, and strategy. “If you ask a question a certain way, it’s going to answer it a certain way.”

Nearly everyone agrees that keeping a human “in the loop”—close to any generative AI use, monitoring for misfires—is key to using AI ethically. But it’s unclear what that will look like in practice. If a journalist uses an AI tool to analyze fifty pages of documents, for example, should the journalist then review all the documents to ensure the synthesis is accurate and unbiased? If the business side of a company sets up a deal for AI-sponsored content, who monitors the result?

Perhaps nobody knows the challenge better than Sports Illustrated. In late 2023, the tech site Futurism noticed that some of SI’s stories were written by people who didn’t exist. One fake byline, Drew Ortiz, had a bio claiming “he grew up in a farmhouse, surrounded by woods, fields, and a creek.” The headshot attached to his profile was an AI-generated image available for purchase on a site called Generated Photos. 

When Futurism inquired about the apparently fake writers, the company promptly deleted all content associated with those bylines. In a statement to Futurism, Sports Illustrated revealed that the content had been produced by a company called AdVon, which describes itself as “a digital commerce platform, developing trusted, SEO-optimized, user-centric AI and content solutions.” 

SI said AdVon had assured it that “all of the articles in question were written and edited by humans. According to AdVon, their writers, editors, and researchers create and curate content and follow a policy that involves using both counter-plagiarism and counter-AI software on all content.” The fake headshots and bios, AdVon claimed, were “to protect author privacy,” a move SI was quick to clarify it did not condone.

The robotic writing in some of the posts raised eyebrows. One of Ortiz’s shopping guides, for “Full-Size Volleyballs,” awkwardly explains: “Even people who don’t watch sports can easily understand the intensity and skill required to play volleyball whenever they watch clips.”

In an all-hands meeting the day after the Futurism article was published, SI executives informed their staff that they had terminated their relationship with AdVon. (Meanwhile, SI’s parent company, Arena Group, publicly disputed the claim that it had published AI-generated work.) But the damage was already done. Sports Illustrated, one of the oldest and once among the most respected sports outlets, lost much of its credibility with staff and readers alike.

“Along with basic principles of honesty, trust, journalistic ethics, etc.—I take seriously the weight of a Sports Illustrated byline. It meant something to me long before I ever dreamed of working here. This report was horrifying to read,” wrote staff writer Emma Baccellieri on X, commenting on Futurism’s story. 

Two months later, after Sports Illustrated’s publisher announced it was in “substantial debt,” Baccellieri and nearly all her coworkers were laid off. Most, including Baccellieri, were soon rehired by SI’s new publisher.

Sean McGregor, founding director of the Digital Safety Research Institute and a member of the Partnership on AI, likens companies’ and newsrooms’ use of AI to the experience of riding in a self-driving car. As people become comfortable with the technology, they become inured to its inherent risks. 

“There’s a tendency in all places where automation is introduced, where, you know, it’s a great tool, empowering people, and then it gets to a point of adequate performance…where you no longer have the ability, because of the way that our brains work, to pay attention and to safeguard the system,” he said. 

Most consumers aren’t yet comfortable with the marriage of AI and news production. In 2023, Benjamin Toff of the University of Minnesota and Felix M. Simon of the Oxford Internet Institute surveyed 1,483 people about their attitudes toward AI. More than 80 percent believed news organizations should “alert readers or viewers that AI was used.” Among people who believed consumers should be alerted, 78 percent said news organizations should “provide an explanatory note describing how AI was used.”

AI has the potential to help journalists do their jobs more efficiently. Used wisely, it can be a marvelous reporting tool. But, undeniably, it also has the potential to misinform, falsely cite, and fabricate information. The role of journalists is to expose deception and misinformation, but AI, for all its promise, has made it exponentially more difficult for journalists—and ordinary citizens—to do just that. We would advise newsrooms and journalists to proceed with caution, but it may be too late for that.

Clarification: An earlier version of this article did not include the rehiring, by a new publisher, of most of the Sports Illustrated staff. 


Julie Gerstein and Margaret Sullivan are contributors to CJR. Julie Gerstein is a research fellow at the Craig Newmark Center for Journalism Ethics and Security at Columbia University. She is a former executive editor of Business Insider. Previously, she served as BI's Singapore bureau chief and was an editor at BuzzFeed. Margaret Sullivan is the executive director of the Craig Newmark Center for Journalism Ethics and Security at Columbia Journalism School. She writes a weekly column for The Guardian US and publishes the American Crisis newsletter on Substack. Previously, she was the chief editor of the Buffalo News, public editor of the New York Times, and the Washington Post’s media columnist.