Objectives: To map applications of large language models (LLMs) and generative AI across the infectious disease spectrum and to identify gaps in the evidence base.
Methods: This scoping review followed JBI methodology and PRISMA-ScR guidelines (protocol pre-registered on OSF). PubMed, Embase, Scopus, and Web of Science were searched in April 2026, supplemented by medRxiv/bioRxiv, OpenAlex, Semantic Scholar, and citation chaining. Studies published from November 2022 onward that evaluated LLMs or generative AI for any infectious disease task were eligible. Screening used an automated rule-based algorithm validated against a 20% stratified sample (92.8% agreement, Cohen's kappa = 0.70, PABAK = 0.86).
Results: From 42,030 records, 516 studies were included. Of these, 503 (97.5%) did not assess hallucination or safety, and only 10 (1.9%) reported clinical deployment. Publication volume grew from 11 studies in 2022 to 245 in 2025. COVID-19 dominated (132, 25.6%), followed by HIV/AIDS (54) and antimicrobial resistance (53). No studies addressed neglected tropical diseases. Diagnosis (116, 22.5%) and education (91, 17.6%) were the most common tasks. ChatGPT was the most frequently evaluated model (162, 31.4%), though most studies did not specify the model version. Most studies (472, 91.5%) remained at the proof-of-concept stage.
Discussion: Generative AI research in infectious disease is growing rapidly but remains concentrated on diseases prevalent in high-income settings, dominated by proprietary models with poor version reporting, and almost entirely pre-clinical. The near-complete absence of safety assessment and the complete lack of studies on neglected tropical diseases are urgent concerns. Addressing them requires minimum reporting standards, redirection of research toward high-burden diseases, and a shift from benchmark testing to implementation science.
Protocol registration: https://doi.org/10.17605/OSF.IO/QE629.
Data availability: The full extracted dataset, screening algorithms, and analysis code are available at https://osf.io/xbtne.
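The agreement statistics and percentages reported above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not the review's analysis code. It assumes that screening decisions were binary (include or exclude) with two raters, in which case PABAK reduces to 2 × observed agreement − 1. The function names are hypothetical.

```python
def pabak(observed_agreement: float) -> float:
    """PABAK for two raters and two categories (assumed here): 2 * p_o - 1."""
    return 2.0 * observed_agreement - 1.0


def implied_chance_agreement(observed_agreement: float, kappa: float) -> float:
    """Solve Cohen's kappa = (p_o - p_e) / (1 - p_e) for the chance agreement p_e."""
    return (observed_agreement - kappa) / (1.0 - kappa)


def percent_of(count: int, total: int) -> float:
    """Share of `count` in `total`, as a percentage rounded to one decimal place."""
    return round(100.0 * count / total, 1)


p_o = 0.928    # reported observed agreement
kappa = 0.70   # reported Cohen's kappa
total = 516    # reported number of included studies

print(round(pabak(p_o), 2))                            # 0.86
print(round(implied_chance_agreement(p_o, kappa), 2))  # 0.76
print(percent_of(503, total))                          # 97.5
print(percent_of(472, total))                          # 91.5
```

Under these assumptions the reported PABAK of 0.86 follows directly from the 92.8% agreement, and the gap between kappa and PABAK corresponds to a chance agreement of about 0.76, which is what one would expect when one screening category (most likely exclusion) is far more common than the other.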
Version 2 changes (13 April 2026): Added Zhou et al. (2025) as comparator review; added PABAK (0.86) alongside Cohen's kappa; updated cross-reference to companion review (Zenodo DOI); minor wording improvements throughout.
Hayden Farquhar studied this question.