April 15, 2026 | Open Access

Applications of Large Language Models and Generative AI Across the Infectious Disease Spectrum: A Scoping Review

Key Points

The review aimed to map applications of large language models (LLMs) and generative AI across infectious diseases and to identify gaps in the evidence base.
It followed JBI scoping review methodology and the PRISMA-ScR reporting guidelines.
PubMed, Embase, Scopus, and Web of Science were searched in April 2026 for studies from November 2022 onward.
Screening used an automated rule-based algorithm, validated against a 20% stratified sample (92.8% agreement, Cohen's kappa = 0.70, PABAK = 0.86).
Of 42,030 records, 516 studies were included; COVID-19 (132), HIV/AIDS (54), and antimicrobial resistance (53) were the most studied topics.
Only 10 studies (1.9%) reported clinical deployment, and 503 (97.5%) did not assess hallucination or safety.
Most studies (472, 91.5%) remained at the proof-of-concept stage.

Abstract

Objectives: To map applications of large language models (LLMs) and generative AI across the infectious disease spectrum and identify gaps in the evidence base.

Methods: This scoping review followed JBI methodology and PRISMA-ScR guidelines (protocol pre-registered on OSF). PubMed, Embase, Scopus, and Web of Science were searched in April 2026, supplemented by medRxiv/bioRxiv, OpenAlex, Semantic Scholar, and citation chaining. Studies evaluating LLMs or generative AI for any infectious disease task from November 2022 onward were eligible. Screening used an automated rule-based algorithm validated against a 20% stratified sample (92.8% agreement, Cohen's kappa = 0.70, PABAK = 0.86).

Results: From 42,030 records, 516 studies were included. Of these, 503 (97.5%) did not assess hallucination or safety, and only 10 (1.9%) reported clinical deployment. Publication volume grew from 11 studies in 2022 to 245 in 2025. COVID-19 dominated (132, 25.6%), followed by HIV/AIDS (54) and antimicrobial resistance (53). No studies addressed neglected tropical diseases. Diagnosis (116, 22.5%) and education (91, 17.6%) were the most common tasks. ChatGPT was the most evaluated model (162, 31.4%), though most studies did not specify the model version. In total, 472 studies (91.5%) remained at proof-of-concept.

Discussion: Generative AI research in infectious disease is growing rapidly but remains concentrated on diseases prevalent in high-income settings, dominated by proprietary models with poor version reporting, and almost entirely pre-clinical. The near-complete absence of safety assessment and the zero-coverage gap for neglected tropical diseases are urgent concerns. Addressing them requires minimum reporting standards, redirection toward high-burden diseases, and a shift from benchmark testing to implementation science.

Protocol registration: https://doi.org/10.17605/OSF.IO/QE629

Data availability: Full extracted dataset, screening algorithms, and analysis code available at https://osf.io/xbtne.
Version 2 changes (13 April 2026): Added Zhou et al. (2025) as comparator review; added PABAK (0.86) alongside Cohen's kappa; updated cross-reference to companion review (Zenodo DOI); minor wording improvements throughout.
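
The three screening agreement figures reported in the Methods (92.8% agreement, Cohen's kappa = 0.70, PABAK = 0.86) are related by standard formulas, and a short sketch may help readers see how. The code below assumes that screening decisions were binary (include or exclude) and that two raters were compared, neither of which the abstract states explicitly. The function names are illustrative and are not taken from the review's published analysis code.

```python
def pabak(observed_agreement: float) -> float:
    """Prevalence- and bias-adjusted kappa, assuming two raters and two categories."""
    return 2.0 * observed_agreement - 1.0


def cohen_kappa(observed_agreement: float, chance_agreement: float) -> float:
    """Cohen's kappa from observed agreement (p_o) and chance agreement (p_e)."""
    return (observed_agreement - chance_agreement) / (1.0 - chance_agreement)


def implied_chance_agreement(observed_agreement: float, kappa: float) -> float:
    """Solve the kappa formula for p_e, given p_o and a reported kappa."""
    return (observed_agreement - kappa) / (1.0 - kappa)


# Values reported in the abstract.
p_o = 0.928
reported_kappa = 0.70

# PABAK depends only on observed agreement (binary-decision assumption).
print(f"PABAK: {pabak(p_o):.2f}")

# Chance agreement implied by the reported kappa; this is a back-calculation,
# not a figure stated in the review.
p_e = implied_chance_agreement(p_o, reported_kappa)
print(f"Implied chance agreement: {p_e:.2f}")
print(f"Kappa recomputed from p_o and p_e: {cohen_kappa(p_o, p_e):.2f}")
```

Under these assumptions the reported PABAK follows directly from the observed agreement. The gap between kappa and PABAK is what one would expect when one screening decision (here, presumably exclusion) is far more common than the other, since high prevalence imbalance raises chance agreement and lowers kappa.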
