The spread of generative artificial intelligence has sharply increased the productivity of researchers worldwide. At the same time, it has become increasingly difficult for editors and reviewers to distinguish genuine scientific breakthroughs from well-edited but low-substance texts, ScienceDaily reports.
After the release of ChatGPT at the end of 2022, researchers began reporting higher personal productivity. In parallel, editors of scientific journals noticed an influx of well-written papers with questionable scientific value. The authors of a study published in the journal Science concluded that large language models, including ChatGPT, are changing how manuscripts are prepared. They increase the number of publications while simultaneously making it harder to assess their quality.
“This is a very widespread pattern across many scientific fields—from the physical and computer sciences to the biological and social sciences. Our current ecosystem has undergone a major shift that requires very serious consideration, especially by those who decide which science should be supported and funded,” said Yian Yin, an assistant professor at the Cornell Bowers College of Computing and Information Science.
Yin’s team analyzed more than 2 million preprints published between 2018 and 2024 on arXiv, bioRxiv, and SSRN—platforms that cover the physical, biological, and social sciences and host papers that have not undergone peer review.
The researchers compared texts written before 2023 with those likely generated using AI and developed a model to detect the use of large language models (LLMs). They tracked authors’ publication dynamics and examined whether these papers were later accepted by journals. On arXiv, researchers who likely used LLMs published about one-third more papers. On bioRxiv and SSRN, the increase exceeded 50%.
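The study does not publish its detector, but the general idea—training a classifier on texts of known origin and scoring new manuscripts—can be sketched with a toy word-frequency (naive Bayes) model. All function names and training phrases below are illustrative assumptions, not material from the study:

```python
from collections import Counter
import math

def tokenize(text):
    return text.lower().split()

def train(docs_by_label):
    """Build per-label word counts from labeled example texts."""
    model = {}
    for label, docs in docs_by_label.items():
        counts = Counter()
        for doc in docs:
            counts.update(tokenize(doc))
        model[label] = (counts, sum(counts.values()))
    return model

def classify(model, text, vocab_size=10_000):
    """Return the label whose word distribution best explains the text
    (log-probabilities with add-one smoothing)."""
    scores = {}
    for label, (counts, total) in model.items():
        scores[label] = sum(
            math.log((counts[tok] + 1) / (total + vocab_size))
            for tok in tokenize(text)
        )
    return max(scores, key=scores.get)

# Toy training data, standing in for pre-2023 vs. likely-LLM corpora.
model = train({
    "human": ["we ran the experiment and report the raw results"],
    "llm": ["delve into the rich tapestry of these findings"],
})
print(classify(model, "delve into the findings"))  # → llm
```

A production detector would use far richer features (stylometry, perplexity under a reference model) and large labeled corpora, but the scoring logic follows this same pattern.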
The strongest effect was observed among authors for whom English is not a native language. Researchers from Asian institutions published 43–89% more papers after they began using LLMs.
The authors also identified AI’s impact on literature searches. Tools such as Bing Chat more often suggested newer and more topically relevant sources than traditional search engines. “People who use LLMs gain access to more diverse knowledge, which may encourage more creative ideas,” said lead author Keigo Kusumegi. He plans to examine whether this is linked to more innovative and interdisciplinary science.
Although language models allow researchers to write more, they undermine traditional criteria for evaluating quality. Previously, complex language was seen as a marker of strong research and increased the chances of publication.
Now that marker no longer works: reviewers often flag AI-generated papers by their overly polished or formulaic phrasing and reject them even when the writing is flawless. A well-polished text is no longer a guarantee of scientific value. Yian Yin warns that this could have serious consequences for science, as publication counts will increasingly fail to reflect researchers’ real contributions.
Going forward, the team plans to investigate causal relationships through controlled experiments. Yin is also organizing a symposium in March 2026 dedicated to the impact of generative AI on science.
“Already, the question is no longer whether you used AI. The question is how you used AI and whether it was helpful or not,” the researcher concluded.