Although research evaluators and scientometricians have promoted the message of responsible bibliometrics through initiatives like the Leiden Manifesto, these do not mention Large Language Models (LLMs). LLMs can now make useful quality predictions for journal articles, giving values that correlate more strongly with expert judgements than do citation-based indicators in most fields. This has created the possibility that they could supplement or even replace citation-based indicators for some applications. As tested so far, LLMs predict the quality rating that a human expert would give a paper. They do this by reading the quality level descriptions and then processing the article title and abstract. This raises multiple new issues that the Leiden Manifesto does not address. First, authors might try to trick LLMs into giving high scores by crafting LLM-friendly abstracts. Second, LLMs incorporate billions of parameters, so their scores are opaque. Third, the main influences on LLM scores are not clear, so their biases are unknown. Fourth, whilst citations reflect tangible and permanent contributions to the scientific record, albeit of variable value, LLM-based predictions do not clearly link to progress. Fifth, LLM scores are ephemeral in the sense that newer LLMs may give substantially different scores and rankings.
Mike Thelwall
Mike Thelwall (Thu,) studied this question.
www.synapsesocial.com/papers/68a35efb0a429f7973328508 — DOI: https://doi.org/10.51408/issi2025_156