What does this research mean for the field?

Large Language Models (LLMs) can effectively assist with knowledge consultation and diagnostic tasks in Traditional Chinese Medicine (TCM), but they struggle to capture TCM's holistic paradigm and individualized diagnosis. Novelty: synthesis. Consensus alignment: neutral.

What question did this study set out to answer?

This study reviews the tuning techniques and applications of large language models in Traditional Chinese Medicine.

February 21, 2026 | Open Access

Tuning and clinical application of large language models in Traditional Chinese Medicine: scoping review

Key Points

Conducted a scoping review following PRISMA-ScR guidelines
Systematically searched seven databases for relevant studies published from database inception to May 2025
Focused on identifying model characteristics, tuning techniques, and evaluation methods
Included 27 studies, with applications mostly in TCM knowledge consultation and diagnostic assistance
LoRA fine-tuning was the most frequently used technique, often combined with other methods
Evaluation methods included accuracy measures and human assessments

Abstract

Background and objective: Large Language Models (LLMs) show significant potential in healthcare, but their application in Traditional Chinese Medicine (TCM) lacks systematic evaluation. This study aims to comprehensively review LLM tuning techniques, data construction strategies, evaluation methods, and application scenarios in TCM clinical practice.

Methods: A scoping review following PRISMA-ScR guidelines was conducted. Researchers systematically searched seven databases for relevant studies published from database inception to May 2025. The analysis focused on identifying model characteristics, tuning techniques, data sources, evaluation methods, application domains, and performance limitations to assess the current state and future directions of TCM-oriented LLMs.

Results: We included 27 studies (21 in English, 6 in Chinese). Application domains comprised TCM knowledge consultation (10 studies) and diagnostic assistance (13 studies), with 4 studies establishing TCM LLM evaluation benchmarks. LoRA fine-tuning was most widely used (65.2%), often combined with prompt engineering (47.8%), continued pre-training (43.5%), and retrieval-augmented generation (39.1%). Most studies (87.0%) combined multiple techniques. Training data balanced theoretical knowledge (classics) with clinical experience (case records), though multimodal data remained severely insufficient. Evaluation methods were multidimensional, with accuracy (63.0%) and human assessment (77.8%) most frequently used. Specialized TCM evaluation benchmarks were gradually being established. Current models excel at integrating heterogeneous knowledge, basic syndrome differentiation reasoning, and cross-language knowledge conversion, but show limitations in simulating complex TCM reasoning processes and individualized diagnosis.

Conclusion: Although TCM-oriented LLMs demonstrate effectiveness in knowledge consultation and diagnostic tasks, they face significant challenges in capturing TCM's holistic paradigm, in data quality, and in clinical evaluation. Future research should develop TCM-compatible model architectures, build standardized multimodal data ecosystems, strengthen clinical translation, and create evaluation frameworks that reflect TCM's diagnostic process.
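The abstract reports percentages without stating their denominators. The sketch below is a reader's consistency check, not part of the review's method: it tests which whole-number study counts reproduce each reported percentage when the denominator is 27 (all included studies) or 23 (an assumed subset, namely the 27 studies minus the 4 benchmark studies). The helper names `matching_counts` and `consistent_denominators` and the candidate denominators are assumptions made for illustration.

```python
# Percentages as reported in the abstract.
reported = {
    "LoRA fine-tuning": 65.2,
    "prompt engineering": 47.8,
    "continued pre-training": 43.5,
    "retrieval-augmented generation": 39.1,
    "multiple technique combinations": 87.0,
    "accuracy": 63.0,
    "human assessment": 77.8,
}


def matching_counts(pct, denominator):
    """Return every study count k whose share of `denominator` rounds to `pct`."""
    return [
        k
        for k in range(denominator + 1)
        if round(100 * k / denominator, 1) == pct
    ]


def consistent_denominators(pct, candidates=(23, 27)):
    """Map each candidate denominator (assumed, not from the paper) to matching counts."""
    result = {}
    for n in candidates:
        counts = matching_counts(pct, n)
        if counts:
            result[n] = counts
    return result


if __name__ == "__main__":
    for label, pct in reported.items():
        print(f"{label}: {pct}% -> {consistent_denominators(pct)}")
```

Under these assumptions, the tuning-technique percentages are consistent only with a denominator of 23, and the two evaluation-method percentages only with 27. The full paper should be consulted for the actual counts.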
