Abstract

Background: Adverse drug events (ADEs) are a major source of preventable harm. Inflammatory bowel disease (IBD) requires long-term multidrug management, making ADEs frequent and clinically significant. Extracting ADEs from electronic health records (EHRs) is central to pharmacovigilance but is challenging because disease activity and drug toxicity overlap, polypharmacy is complex, and Chinese clinical narratives are heterogeneous. Conventional named entity recognition (NER)–relation extraction (RE) pipelines and fixed vocabularies often miss evolving expressions. Large language models (LLMs) show promise for ADE detection [1], yet most current approaches are not end-to-end and rely on constrained annotation schemas, limiting scalability and real-world generalizability in IBD.

Methods: We developed an end-to-end LLM pipeline for open-vocabulary detection of ADEs following treatment with corticosteroids, immunomodulators, biologics, and small-molecule inhibitors in IBD. A total of 8406 IBD notes were annotated (Peking Union Medical College Hospital = 7936; Zunyi Medical University = 216; Guizhou Provincial People’s Hospital = 254). The system reads clinical text directly, expands candidate events through knowledge-augmented retrieval, and normalizes outputs to MedDRA to support reliable and scalable pharmacovigilance. It integrates (i) a high-recall pre-screening module that retains plausible ADE signals while minimizing unnecessary LLM calls, (ii) graph-based retrieval over a drug–event bipartite network to broaden candidate scope, (iii) ensemble LLM inference guided by a self-learned instruction set, and (iv) ontology-aware normalization that aligns terms with MedDRA and ensures cross-center consistency.
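As a rough illustration of how the four stages listed in the Methods might chain together, the following Python sketch runs a toy note through a pre-screen, a one-hop lookup in a drug-event bipartite graph, and a surface-form normalization table. Everything in it is an assumption made for illustration: the function names, the English toy terms, the drug-event links, and the canonical labels are hypothetical and do not come from the study or from MedDRA. Plain substring matching and a set intersection stand in for the ensemble LLM step, so unlike the described system this sketch is not open-vocabulary.

```python
from collections import defaultdict

# All terms, links, and labels below are hypothetical toy data, not from the study.
SURFACE_TO_CANONICAL = {
    "low white cell count": "leukopenia",
    "leukopenia": "leukopenia",
    "skin eruption": "rash",
    "rash": "rash",
}

DRUG_EVENT_EDGES = [
    ("azathioprine", "leukopenia"),
    ("infliximab", "rash"),
    ("infliximab", "leukopenia"),
]


def prescreen(note):
    """Stage (i): cheap high-recall filter; keep a note if any surface form appears."""
    text = note.lower()
    return any(surface in text for surface in SURFACE_TO_CANONICAL)


def build_graph(edges):
    """Stage (ii): adjacency map for the drug side of a drug-event bipartite graph."""
    graph = defaultdict(set)
    for drug, event in edges:
        graph[drug].add(event)
    return graph


def candidate_events(note, graph):
    """Stage (ii): one-hop retrieval of events linked to drugs named in the note."""
    text = note.lower()
    return {drug: events for drug, events in graph.items() if drug in text}


def mentioned_events(note):
    """Stage (iv) stand-in: map surface forms found in the note to canonical labels."""
    text = note.lower()
    return {canon for surface, canon in SURFACE_TO_CANONICAL.items() if surface in text}


def extract_pairs(note):
    """Chain the stages; a set intersection stands in for the ensemble LLM step (iii)."""
    if not prescreen(note):
        return []  # no plausible ADE signal, so no further work is done
    graph = build_graph(DRUG_EVENT_EDGES)
    found = mentioned_events(note)
    pairs = []
    for drug, events in candidate_events(note, graph).items():
        for event in events & found:
            pairs.append((drug, event))
    return sorted(pairs)


note = "Started azathioprine in March; low white cell count noted at week 6."
print(extract_pairs(note))
```

In this sketch the pre-screen returns early for notes with no trigger term, which mirrors the stated goal of avoiding unnecessary LLM calls; a real system would replace the lookup table and the intersection with model inference and ontology-aware mapping.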
Results: In the binary classification task of detecting the presence of any adverse event (AE) in a patient’s record, we selected the HYBRID + LR model for subsequent analyses; it achieved an area under the curve (AUC) of 0.809 on the Crohn's disease (CD) test set (Figure 1A) and 0.828 on the ulcerative colitis (UC) test set (Figure 1B). For identifying drug-AE pairs, the model achieved an F1-score of 0.577, a recall of 0.706, and a precision of 0.488 on the CD test set, and an overall F1-score of 0.545, a recall of 0.624, and a precision of 0.484 in the UC cohort. The five most frequent ADEs in the IBD cohort were bone marrow suppression (n = 137), liver function abnormality (n = 134), C. difficile infection (n = 73), rash (n = 71), and paresthesia (n = 70).

Conclusion: The pipeline enables near real-time ADE detection and supports risk prediction in IBD. Embedding LLM-based pharmacovigilance in EHRs may deliver continuous safety surveillance and bridge clinical practice with regulatory science for data-driven, real-time monitoring.

Reference: 1. Syrowatka A, Song W, Amato MG, et al. Key use cases for artificial intelligence to reduce the frequency of adverse drug events: a scoping review. Lancet Digit Health. Feb 2022;4(2):e137-e148. doi:10.1016/s2589-7500(21)00229-6

Conflict of interest: Ms. Wei, Yuge: None; Ronghao, Li: None; Gechong, Ruan: None; Bai, Xiaoyin: None; Yinghao, Sun: None; Dejun, Cui: None; Fang, Yan: None; Huijun, Shu: None; Xuemin, Yan: None; Honglei, Liu: None; Yang, Hong: None
Y Wei
L Ronghao
R Gechong
Journal of Crohn's and Colitis
Chinese Academy of Medical Sciences & Peking Union Medical College
Capital Medical University
Peking Union Medical College Hospital
synapsesocial.com/papers/697310b0c8125b09b0d20636 — DOI: https://doi.org/10.1093/ecco-jcc/jjaf231.1007