What question did this study set out to answer?

The research investigates how well large language models (LLMs) can facilitate scientific discovery in immunology by comparing their cognitive abilities.

March 7, 2026

Research Highlights

Key Points

Compared five large language models in four evaluation frameworks
Evaluated models for data mining accuracy, hypothesis generation, experimental proposal, and inferring biological principles
Used case examples from systems vaccinology to assess performance
ChatGPT-4o, Microsoft Copilot, and SciSpace outperformed LLaMA-70B in data mining accuracy
All models showed proficiency in recalling factual information and recognizing patterns
No model generated sufficiently innovative hypotheses or experimental proposals, emphasizing the need for human oversight

Abstract

Assessing AI’s Cognitive Abilities for Scientific Discovery in the Field of Systems Vaccinology

Rodriguez-Coffinet L, Kazmin D, Pulendran B. Sci Immunol. 2025;10:eadx1794

Rapidly evolving large language models (LLMs) are undoubtedly transforming the academic landscape. But how they can be effectively employed to enhance the capacity of investigative research, particularly in biologically complex fields such as immunology, remains controversial.1 On the one hand, LLMs offer researchers unparalleled access to vast datasets, uncover hidden patterns and insights, and generate potentially novel and testable hypotheses. On the other hand, limitations in their training data and an inherent design that prioritizes pattern prediction often result in a phenomenon called “hallucination,” in which false, and sometimes fabricated, information is presented with confidence. In addition, questions remain regarding whether LLMs are simply excellent synthesizers of multilayered datasets or true risk-taking hypothesis creators.

In this study,2 the authors compared 5 prevailing LLMs in 4 evaluation frameworks. The 5 LLMs were ChatGPT-4o, ChatGPT-4.5, Microsoft Copilot, LLaMA-70B, and SciSpace. They differ considerably in architecture and training scope and were therefore anticipated to perform quite variably in the comparison. The 4 evaluation frameworks were (1) accuracy in data mining, (2) ability in hypothesis generation, (3) ability to propose appropriate experiments to validate hypotheses, and (4) ability to infer broader biological principles from the experimental results. To compare the 5 LLMs across the 4 evaluation frameworks, the authors used 3 distinct case examples in systems vaccinology, their area of subject expertise.
The 3 case examples were (1) the role of general control nonderepressible 2 in antigen presentation by dendritic cells in vaccine responses, (2) the role of sterol regulatory element-binding protein in metabolic processes pertinent to B-cell responses to vaccines, and (3) the role of Toll-like receptor 5 in vaccine immunity following antibiotic-induced alterations in the gut microbiome. Across all 3 case examples, ChatGPT-4o, Microsoft Copilot, and SciSpace consistently outperformed LLaMA-70B in accuracy of data mining. It is important to keep in mind that SciSpace is an LLM tailored specifically for scientific literature comprehension and citation retrieval. In addition, all LLMs demonstrated aptitude for factual recall and pattern recognition: they successfully retrieved known mechanisms related to the biological question, formulated coherent hypotheses, and suggested plausible experiments for testing those hypotheses. However, their outputs largely reorganized established findings or restated well-known mechanisms and did not generate sufficiently insightful hypotheses or experimental proposals to warrant empirical testing. These results underscored the models' strength in systems-level analysis but also highlighted their central constraint: a lack of autonomously creative, risk-taking hypothesis-generating capacity. Such a limitation reaffirmed the importance of human participation in the process of scientific inquiry. In all 3 case examples, ChatGPT-4.5 demonstrated the highest level of integrative reasoning, likely reflecting the refined reasoning algorithms of this version.

This study is a timely cross-sectional assessment of the currently available LLMs in their role in supporting scientific discoveries. LLMs will clearly continue to evolve rapidly, and their applications in investigative research will likely evolve with them.
At the present time, available LLMs are sufficiently mature to be incorporated into our research workflows and to form a solid foundation for future co-evolution. However, given their current limitations, a model of “hybrid intelligence” combining LLM analysis with human expertise3 is likely the most constructive path forward. It is also important to incorporate an “iterative dialogue” typical of human thought processes by using multilayered and adaptive prompts to better simulate real-world investigative workflows. A rational strategy is to build deliberate “skeptics” that challenge the initial LLM output, with the goal of enhancing the rigor of our research.

Curing Autoimmune Diabetes in Mice With Islet and Hematopoietic Cell Transplantation After CD117 Antibody-based Conditioning

P. Bhagchandani et al., J Clin Invest. 2026; 136(1): e190034. https://doi.org/10.1172/JCI190034.

Type 1 diabetes remains a paradigmatic autoimmune disease in which immune-mediated destruction of pancreatic β cells results in lifelong insulin dependence. Despite major advances in insulin delivery and glucose monitoring, curative therapies remain elusive. Islet transplantation can restore endogenous insulin secretion and improve glycemic stability, yet durable graft function is uncommon. Both alloimmune rejection and recurrent autoimmunity contribute to graft loss. Although hematopoietic stem cell transplantation–based tolerance strategies have the potential to address both immune barriers, their application in islet transplantation has been limited by the toxicity of the conditioning regimens necessary for effective stem cell engraftment. To address this limitation, attention has increasingly focused on conditioning approaches that permit hematopoietic engraftment without global cytotoxicity, in particular targeted strategies that selectively deplete host hematopoietic stem and progenitor cells.
CD117 (c-Kit), a receptor tyrosine kinase expressed on hematopoietic stem cells, represents a particularly appealing target. Preclinical work has shown that CD117 antibody–drug conjugates can achieve efficient stem cell ablation while preserving broader immune competence.1 Against this background, Bhagchandani and colleagues report a preclinical strategy combining anti-CD117–based conditioning with hematopoietic and pancreatic islet transplantation in autoimmune-prone mice. In this study,2 the anti-CD117–based conditioning regimen consisted of anti-CD117 monoclonal antibody combined with transient T-cell depletion, JAK1/2 inhibition, and low-dose total body irradiation to facilitate engraftment of major histocompatibility complex-mismatched donor hematopoietic cells. Female nonobese diabetic mice served as transplant recipients, whereas fully allogeneic C57BL/6 donor mice provided hematopoietic cells and islets. Donor hematopoietic chimerism was quantified by flow cytometry using strain-specific major histocompatibility complex class I markers to distinguish donor-derived from host-derived immune cell populations. The authors demonstrate that anti-CD117 conditioning supports donor hematopoietic engraftment without the conventionally used DNA-damaging agents. This engraftment permits the establishment of stable mixed hematopoietic chimerism, creating a platform for immune modulation in an autoimmune setting. Specifically, chimerism can promote central and peripheral tolerance to donor antigens and, in some experimental contexts, modulate autoreactive immune responses as well. 
In the context of islet transplantation, mixed chimerism, often in combination with regulatory T cell–based strategies, has been proposed as a means to address the dual challenge of alloimmunity and autoimmunity that undermines graft survival.3 In the current study, anti-CD117–based conditioning enabled durable donor hematopoietic engraftment and immune re-education in the nonobese diabetic model, establishing a tolerant immune environment without clinical evidence of graft-versus-host disease. Notably, this approach allowed reversal of established autoimmune diabetes in addition to simple disease prevention. In this preclinical setting, immune tolerance induced by hematopoietic chimerism was sufficient to protect transplanted allogeneic islets from ongoing autoimmune attack. These findings suggest that reconstitution of the immune system within a chimeric environment can attenuate autoreactive T-cell activity and support regulatory immune pathways. Although the precise balance between central deletion and peripheral regulation was not fully examined, these data align with longstanding observations that mixed chimerism can reshape immune recognition in a durable manner.

At the same time, significant challenges remain. Mouse models do not fully capture the heterogeneity of human type 1 diabetes, and achieving stable mixed chimerism in humans without graft-versus-host disease is complex. The long-term safety of CD117 targeting, including potential effects on nonhematopoietic c-Kit–expressing tissues, will require careful evaluation. Future studies will need to explore whether similar tolerance mechanisms can be achieved using gene-edited hematopoietic cells and how these approaches might integrate with emerging β-cell replacement platforms. Overall, Bhagchandani et al provide a rigorous preclinical demonstration that nongenotoxic hematopoietic conditioning can support immune reprogramming sufficient to protect islet grafts from autoimmune destruction.
By bridging transplant immunology, tolerance induction, and regenerative medicine, this work offers a framework for therapies aiming not at chronic disease management, but at durable immune-mediated control of autoimmunity.

Cite This Study

Krynychka et al. (Thu,) studied this question.

synapsesocial.com/papers/69abc2175af8044f7a4eb5a9 https://doi.org/10.1097/tp.0000000000005680
