The Lede Quiz: ChatGPT Beat (Most of) You

But the bot made one big mistake.

May 23, 2025



As loyal Laurels and Darts readers might remember, way back on May 16, 2025, we tweaked our usual formula to align with CJR’s special coverage of AI and journalism. We selected five ledes, each from a different news organization, paired each with a version generated by ChatGPT, and asked readers to detect which came from the news site and which came from the bot.

CJR colleagues Dean Pajevic and Michael Murphy tracked the results (anonymously!), and we can report that our readers have—as we teachers often say in our evaluations—done perfectly fine, but also have room to improve: The average score from nearly 1,500 responses was 59.87 percent. About a third of you got three out of five correct, while fewer than 10 percent got all five right and fewer than 10 percent got all five wrong.

Which raised the question: How would ChatGPT do? So we gave it the same test, and it outperformed most of you humans, getting four out of five right. 

Only one lede tripped up ChatGPT. The bot concluded that the actual New York Times lede had been written by a bot, and, conversely, that its own lede had been written by a human.

Here were the choices: 

  • Lede 1 (from the Times): The Wisconsin judge arrested last month and accused of helping an undocumented immigrant evade federal agents was indicted by a federal grand jury on Tuesday on charges of concealing a person from arrest and obstruction of proceedings.
  • Lede 2 (from ChatGPT): Wisconsin Judge Hannah Dugan has been indicted on charges of obstruction and aiding an immigrant in evading deportation, a move that has sparked significant legal and ethical controversy.

The bot was kind enough to explain its reasoning. It found that Lede 1’s tone was “straightforward, fact-based, and formal…just the who, what, when, and why.” That “matches ChatGPT’s tendency to produce clear, impartial reporting language.” And it used “specific legal language…a hallmark of AI-generated formality.”

In contrast, ChatGPT determined that its own prose (Lede 2) was human-generated because it was “more dramatic and interpretive…with a subjective framing typical of human reporters.” That version made “an effort to engage the reader with broader implications” and had “more journalistic flair and context setting, making it less dry than typical AI-generated news copy.”

I then had one more question for the bot. Why did it mess up this one lede? It responded, “The overlap between high-quality AI writing and human writing is now so close that certainty is no longer possible from style alone.”

Thank you, ChatGPT, for your completely accurate kicker.

Note: CJR will be off Monday for Memorial Day. As for Laurels and Darts, we’ll return to our usual, 100 percent human-generated format next week. If you have an item you’d like us to consider, please send it to laurelsanddarts@cjr.org. We can’t acknowledge all submissions, but we will mention you if we use your idea. For more on Laurels and Darts, please click here.


Bill Grueskin is on the faculty at Columbia Journalism School. He has previously worked as founding editor of a newspaper on the Standing Rock Sioux Indian Reservation, city editor of the Miami Herald, deputy managing editor of the Wall Street Journal, and an executive editor at Bloomberg News. He is a graduate of Stanford University (Classics) and Johns Hopkins’s School of Advanced International Studies (US Foreign Policy and International Economics).