Large-scale population cohorts and biobanks are cornerstones of precision medicine, providing extensive multimodal data that offer unprecedented opportunities to elucidate the mechanisms of complex diseases. However, traditional biobanks have largely functioned as static data repositories, and prevailing analytical frameworks face critical challenges, including data heterogeneity, fragmented knowledge extraction, and slow clinical translation. This article proposes the “intelligent biobank” as a new paradigm, repositioning these passive repositories as active “predictive and discovery engines.” We argue that the emergence of large language models (LLMs) offers a transformative approach to addressing these challenges. Serving as a unified computational interface across data modalities, LLMs can draw on advanced natural language understanding and generation to integrate genomic, phenomic, imaging, and other multidimensional data. This integration enables interpretable biomarker discovery, dynamic risk prediction, and mechanistic hypothesis generation. Such a transformation could substantially accelerate the translation of precision medicine from research into clinical practice.
Yin et al. (Sun,) studied this question.