With artificial intelligence (AI) becoming prevalent in more and more walks of life, it should come as no surprise that large language models (LLMs) are widely deployed in academic writing too, as confirmed by a study posted on arXiv, the preprint server operated by Cornell University. ‘Delving into ChatGPT usage in academic writing through excess vocabulary’, by Dmitry Kobak, Rita González Márquez, Emoke-Ágnes Horvát, and Jan Lause, notes that recent LLMs can generate and revise text with human-level performance and have been widely commercialised in systems like ChatGPT.
These models come with clear limitations: they can produce inaccurate information, reinforce existing biases, and be easily misused. Yet many scientists use them to assist their scholarly writing. How widespread is LLM usage in the academic literature currently? To answer this question, the researchers took an unbiased, large-scale approach, free from any prior assumptions about academic LLM usage. They analysed over 14mn PubMed abstracts published between 2010 and 2024, built a matrix of word occurrences across these abstracts, and calculated the annual frequency of each word. By comparing the observed frequencies in 2023 and 2024 to counterfactual projections based on trends from 2021 and 2022, they identified words with significant increases in usage. These words, termed “excess words”, were then used to gauge the influence of LLMs.
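For readers who think in code, a minimal sketch of that projection step might look like the Python below. This is not the authors' pipeline: the whitespace tokenisation, the linear extrapolation from the 2021–2022 trend, and the flagging threshold are all simplifying assumptions of ours.

```python
# Sketch (not the authors' exact method) of flagging "excess words":
# project each word's 2024 frequency from its 2021-2022 trend, then compare.
from collections import Counter

def annual_frequencies(abstracts_by_year):
    """Map year -> {word: fraction of that year's abstracts containing it}."""
    freqs = {}
    for year, abstracts in abstracts_by_year.items():
        counts = Counter()
        for text in abstracts:
            counts.update(set(text.lower().split()))  # count each word once per abstract
        n = len(abstracts)
        freqs[year] = {w: c / n for w, c in counts.items()}
    return freqs

def expected_frequency(freqs, word):
    """Counterfactual 2024 frequency: linear extrapolation from 2021-2022
    (a simplifying assumption; the paper's projection is more careful)."""
    f21 = freqs.get(2021, {}).get(word, 0.0)
    f22 = freqs.get(2022, {}).get(word, 0.0)
    return max(f22 + 2 * (f22 - f21), 0.0)  # two yearly steps from 2022 to 2024

def excess_words(freqs, threshold_ratio=1.5):
    """Words whose observed 2024 frequency far exceeds the projection."""
    flagged = {}
    for word, observed in freqs.get(2024, {}).items():
        expected = expected_frequency(freqs, word)
        if expected > 0 and observed / expected >= threshold_ratio:
            flagged[word] = (observed, expected)
    return flagged
```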
The analysis revealed that certain words, especially stylistic ones like “delves,” “showcasing,” and “underscores,” showed marked increases in frequency, suggesting LLM involvement. The researchers quantified this excess usage with two measures: the excess frequency gap (the difference between observed and expected frequencies) and the excess frequency ratio (the ratio of observed to expected frequencies). They found a substantial rise in the number of excess words in 2024, coinciding with the widespread availability of ChatGPT. This increase was unprecedented, surpassing the vocabulary changes observed during the Covid-19 pandemic.
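Expressed in code, the two measures amount to a subtraction and a division. The function below is our own illustration of the definitions in the paragraph above, not code from the paper.

```python
def excess_measures(observed, expected):
    """The two quantities described above, for a single word:
    gap   = observed - expected   (absolute excess, as a fraction of abstracts)
    ratio = observed / expected   (relative excess)
    """
    gap = observed - expected
    ratio = observed / expected if expected > 0 else float("inf")
    return gap, ratio

# e.g. a word appearing in 5% of abstracts against a projected 1%:
gap, ratio = excess_measures(0.05, 0.01)  # gap = 0.04, ratio = 5.0
```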
To estimate the extent of LLM usage, the researchers used the frequency gap of excess words as a lower bound. Because the gap is measured as a fraction of all abstracts, a word's gap directly bounds the share of affected papers: the word “potential”, for example, showed an excess frequency gap of around 0.04, indicating that at least 4% of 2024 abstracts included this word due to LLM influence. By analysing abstracts containing words with excess usage, the authors obtained a lower bound of 10% for LLM-assisted papers in 2024. This bound is deliberately conservative: the actual figure could be higher, since some LLM-processed abstracts will not contain any of the tracked excess words.
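The arithmetic behind the headline number can be sketched in a few lines. The 14% and 4% inputs below are illustrative placeholders of ours, not figures from the paper; only the resulting 10% matches the reported bound.

```python
def llm_share_lower_bound(observed_any, expected_any):
    """Lower bound on the share of LLM-assisted abstracts: the excess in the
    fraction of abstracts containing at least one tracked excess word.
    Argument names and the combination rule are our simplification of the
    paper's estimator."""
    return max(observed_any - expected_any, 0.0)

# Illustrative numbers: if 14% of 2024 abstracts contain some excess word
# where the pre-LLM trend projects only 4%, the bound is 10%.
print(llm_share_lower_bound(0.14, 0.04))  # prints roughly 0.10
```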
The research highlights a significant shift in academic writing styles due to the advent of LLMs like ChatGPT. By developing a novel methodology to track excess word usage, the study provides compelling evidence that LLMs have had a notable impact on scientific literature, with at least 10% of recent biomedical abstracts showing signs of AI assistance. This underscores the transformative effect of LLMs on scholarly communication and raises important questions about research integrity and the future of academic writing.
Language models have existed in statistical form since at least the 1990s, but it wasn't until the transformer breakthroughs of the late 2010s that today's large language models emerged. They now underpin chatbots such as ChatGPT and are increasingly being built into assistants like Apple's Siri. Because of their ability to process vast sets of data, there have been attempts to use them to accelerate progress across a range of industries. Early drug discovery is no exception, and many AI-led biotechs have released LLM-related announcements.
LLMs are based on deep learning architectures, particularly transformers, which enable them to process and generate text with a high degree of coherence and relevance. LLMs, like GPT-4, are trained on vast amounts of text data from the Internet, books, articles and other sources. This training allows them to learn the complexities of language, including grammar, context and even some level of reasoning and common sense. The latter is a large claim, so scientists are right to be wary.
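To make the generation step concrete, here is a tiny example using the open-source Hugging Face transformers library. GPT-2 stands in for proprietary models like GPT-4, whose weights are not publicly available; the prompt is our own.

```python
# Minimal text generation with a small transformer model
# (pip install transformers torch).
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
out = generator(
    "In this study, we delve into",  # a prompt echoing the study's "excess words"
    max_new_tokens=30,
    num_return_sequences=1,
)
print(out[0]["generated_text"])
```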