A study that confidently declared OpenAI’s ChatGPT can improve student learning has been retracted, roughly a year after its publication, after Springer Nature spotted “discrepancies” in the analysis and lost faith in its conclusions. Not that the paper minded: it had already racked up hundreds of citations and enjoyed a glorious lap around social media before the plug was pulled.

“The paper’s authors made some very attention-grabbing claims about the benefits of ChatGPT on learning outcomes,” said Ben Williamson, a senior lecturer at the University of Edinburgh’s Centre for Research in Digital Education and the Edinburgh Futures Institute, in an email to Ars. “It was treated by many on social media as one of the first pieces of hard, gold standard evidence that ChatGPT, and generative AI more broadly, benefits learners.”

The retracted paper aimed to quantify “the effect of ChatGPT on students’ learning performance, learning perception, and higher-order thinking” by analyzing results from 51 previous studies. Its meta-analysis calculated effect sizes between experimental groups that used ChatGPT and control groups that did not, supposedly showing “a large positive impact on improving learning performance” along with a “moderately positive impact on enhancing learning perception” and “fostering higher-order thinking.” The findings first appeared in Humanities & Social Sciences Communications on May 6, 2025.
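
For readers unfamiliar with the arithmetic, here is a minimal Python sketch of how a meta-analysis can turn group comparisons into a single pooled effect size. It is an illustration only: the article does not say which effect-size measure or pooling model the retracted paper used, so the choice of Cohen's d with fixed-effect inverse-variance weighting is an assumption, and the function names and numbers are hypothetical.

```python
import math


def cohens_d(mean_exp, sd_exp, n_exp, mean_ctl, sd_ctl, n_ctl):
    """Standardized mean difference between an experimental and a control group."""
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = ((n_exp - 1) * sd_exp ** 2 + (n_ctl - 1) * sd_ctl ** 2) / (
        n_exp + n_ctl - 2
    )
    return (mean_exp - mean_ctl) / math.sqrt(pooled_var)


def d_variance(d, n_exp, n_ctl):
    """Common large-sample approximation of the sampling variance of d."""
    return (n_exp + n_ctl) / (n_exp * n_ctl) + d ** 2 / (2 * (n_exp + n_ctl))


def pooled_effect(effects, variances):
    """Fixed-effect pooled estimate: inverse-variance weighted mean of the effects."""
    weights = [1.0 / v for v in variances]
    return sum(w * d for w, d in zip(weights, effects)) / sum(weights)


# Hypothetical study: ChatGPT group scores 80 (SD 10), control scores 75 (SD 10).
d = cohens_d(80, 10, 30, 75, 10, 30)
v = d_variance(d, 30, 30)

# Pool it with a second hypothetical study of equal precision.
combined = pooled_effect([d, 0.9], [v, v])
```

The arithmetic returns a pooled number whether or not the underlying studies measured comparable outcomes in comparable populations, which is why the choice of input studies matters more than the calculation itself.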

Williamson noted the paper appeared to be “synthesizing very poor quality studies, or mixing together findings from studies that simply cannot be accurately compared due to very different methods, populations and samples.” He also questioned the timing: the paper appeared just two and a half years after ChatGPT’s release in November 2022. “It is not feasible that dozens of high-quality studies about ChatGPT and learning performance could have been conducted, reviewed, and published in that time,” he said.

Since publication, the study has been cited 262 times in Springer Nature peer-reviewed journals and 504 times in total, has attracted nearly half a million readers, and has scored in the 99th percentile for attention. “All of the details about the study got stripped away,” Williamson lamented. “All that was left were the major claims, which certain social media users helped boost and propel.”

Ilkka Tuomi, chief scientist of Meaning Processing Ltd., had warned on LinkedIn about meta-analyses attempting to “draw conclusions about incompatible and ill-defined outcomes” from different populations. “The only reason to do these studies seems to be that statistics and meta-analysis tools can crunch out numbers that look [like] science,” Tuomi wrote.

On April 22, 2026, Springer Nature posted a retraction notice citing “concerns regarding discrepancies in the meta-analysis” and stating that “the authors had not responded to correspondence regarding the retraction.” Williamson shared the notice on Bluesky and LinkedIn, worried that many readers would miss the retraction and that “the headline finding that ChatGPT helps learning performance might persist despite its retraction.”

“All of this is hugely frustrating for those of us trying hard to make sense of what AI means for learning, teaching, and education more generally,” Williamson told Ars. “We have had several years of hype about AI in education, but what we have really needed is high-quality research that can actually show us what kinds of impacts AI is having in classrooms and learning practices.”

Meanwhile, educators scramble to prevent AI-enabled cheating, tech companies push “study mode” chatbots and SAT practice tools, and at least one country is reintroducing physical books and pen-and-paper learning. But hey, a retracted meta-analysis said ChatGPT is great, so who needs evidence?