Rare diseases affect more than 300 million people worldwide [1, 2, 3], yet timely and accurate diagnosis remains an urgent challenge [1, 3, 4, 5]. Patients often endure a prolonged ‘diagnostic odyssey’ exceeding 5 years, marked by repeated referrals, misdiagnoses and unnecessary interventions, leading to delayed treatment and substantial emotional and economic burden [4, 5]. Here we present DeepRare, a multi-agent system for rare disease differential diagnosis decision support [6, 7, 8] powered by large language models, integrating more than 40 specialized tools and up-to-date knowledge sources. DeepRare processes heterogeneous clinical inputs, including free-text descriptions, structured Human Phenotype Ontology (HPO) terms and genetic testing results, to generate ranked diagnostic hypotheses with transparent reasoning linked to verifiable medical evidence. Evaluated across nine datasets from literature, case reports and clinical centres across Asia, North America and Europe spanning 14 medical specialties, DeepRare demonstrates exceptional performance on 2,919 diseases. In HPO-based tasks, it achieves an average Recall@1 of 57.18%, outperforming the next best method by 23.79%; in multi-modal tests, it reaches 69.1% compared with Exomiser’s 55.9% on 168 cases. Expert review achieved 95.4% agreement on its reasoning chains, confirming their validity and traceability. Our work not only advances rare disease diagnosis but also demonstrates how the latest powerful large-language-model-driven agentic systems can reshape current clinical workflows.
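For readers unfamiliar with the Recall@1 figures quoted above, the following is a minimal sketch of how Recall@k is commonly computed for ranked diagnostic hypotheses: the fraction of cases whose confirmed diagnosis appears among the top k ranked candidates. It is not the authors' evaluation code. The function name `recall_at_k`, the exact-string matching rule and the toy cases are all assumptions made for illustration; the paper may match diagnoses differently (for example, by ontology identifier).

```python
from typing import Sequence


def recall_at_k(
    ranked_predictions: Sequence[Sequence[str]],
    true_diagnoses: Sequence[str],
    k: int = 1,
) -> float:
    """Fraction of cases whose true diagnosis is within the top-k ranked hypotheses.

    Assumption: a prediction counts as correct on an exact string match.
    """
    if len(ranked_predictions) != len(true_diagnoses):
        raise ValueError("each case needs exactly one ranked list and one true diagnosis")
    if not true_diagnoses:
        raise ValueError("at least one case is required")
    if k < 1:
        raise ValueError("k must be a positive integer")

    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_diagnoses)
        if truth in ranked[:k]
    )
    return hits / len(true_diagnoses)


# Hypothetical toy data (not taken from the paper's datasets).
toy_rankings = [
    ["Marfan syndrome", "Loeys-Dietz syndrome"],           # correct at rank 1
    ["Fabry disease", "Gaucher disease", "Pompe disease"],  # correct at rank 3
    ["Wilson disease"],                                     # correct at rank 1
    ["Rett syndrome", "Angelman syndrome"],                 # true diagnosis absent
]
toy_truth = [
    "Marfan syndrome",
    "Pompe disease",
    "Wilson disease",
    "Pitt-Hopkins syndrome",
]

print(recall_at_k(toy_rankings, toy_truth, k=1))  # 0.5
print(recall_at_k(toy_rankings, toy_truth, k=3))  # 0.75
```

On this toy set, two of four cases are correct at rank 1 and three of four within the top 3, so widening k can only keep recall the same or raise it. Recall@1 is the strictest setting, which is why it is the headline number in comparisons such as the one above.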
Weike Zhao
Chaoyi Wu
Ye Fan
Nature
Harvard University
Shanghai Jiao Tong University
XinHua Hospital
www.synapsesocial.com/papers/6997b911baf9c852d8c25e03 — DOI: https://doi.org/10.1038/s41586-025-10097-9