July 16, 2025

Beyond Accuracy: Multidimensional Evaluation of Large Language Models in Hepatocellular Carcinoma Management Emphasizing Prompting

Key Points

Large language models exhibit potential in managing hepatocellular carcinoma, especially with effective prompting.
Models such as ChatGPT-4o and Grok-3 showed high accuracy rates of 93% and 95%, respectively.
Prompting significantly improved model performance, indicating the importance of structured instructions for clinical tasks.
Findings highlight the need for further research to optimize language model usability in oncology.

Abstract

Background: [...] 2.60 ± 0.06, 95%) and interpretability (0.43; 0.43). Prompting significantly improved accuracy (p < 0.001) and interpretability (p < 0.001) across all models. Semantic consistency declined slightly in most models; information entropy generally increased; readability changes varied.

Conclusions: This study presents the first multidimensional evaluation of large language models in hepatocellular carcinoma-related clinical tasks. General-purpose models outperformed some medical models, revealing limitations in domain-specific fine-tuning. Prompt design strongly influenced model performance. Further research should integrate diverse prompt strategies and clinical scenarios to improve the usability of language models in real-world oncology settings.

Lay summary: This study evaluated how well advanced language-based artificial intelligence models can answer clinical questions related to hepatocellular carcinoma. The results showed that some models, especially when guided with structured instructions, provided accurate and understandable responses. These findings suggest that such tools may help improve communication and access to information for both doctors and patients managing liver cancer.
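The abstract reports that information entropy generally increased under prompting, but the text available here does not say how entropy was computed. As a rough illustration only, the sketch below assumes Shannon entropy over the whitespace-token distribution of a single response; the function name `shannon_entropy` and the two example strings are hypothetical and are not taken from the study.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Shannon entropy, in bits per token, of the word distribution in `text`.

    Assumption: tokens are lowercase whitespace-separated words. The study
    may have used a different tokenization or a different entropy definition.
    """
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    total = len(tokens)
    # H = -sum(p * log2(p)) over the empirical token probabilities
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical responses, invented purely to exercise the function.
unprompted = "see a doctor see a doctor see a doctor"
prompted = "stage the tumor assess liver function then discuss treatment options"

print(f"unprompted: {shannon_entropy(unprompted):.2f} bits/token")
print(f"prompted:   {shannon_entropy(prompted):.2f} bits/token")
```

Under this definition, a response that repeats a few words scores lower than one that uses a more varied vocabulary, which is one plausible reading of an entropy increase after prompting.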

Authors

Jianchen Luo

Jing Ma

Tao Wang
