What question did this study set out to answer?

This research aims to develop a machine learning model and a simplified decision tree to predict the risk of cachexia in gastric cancer patients.

March 13, 2026 | Open Access

From complex algorithms to clinical practice: a multicenter machine learning model and simplified decision tree for predicting cachexia risk in gastric cancer

Key Points

This research aims to develop a machine learning model and a simplified decision tree to predict the risk of cachexia in gastric cancer patients.
The study was a multicenter retrospective analysis of data from three hospitals.
Variables were selected using univariable and multivariable analyses.
Multiple machine learning models were constructed and compared, with a particular focus on the Random Forest model.
Model performance was evaluated with metrics such as AUC and Decision Curve Analysis.
The study included 1,570 gastric cancer patients, with a cachexia prevalence of 30.3%.
The Random Forest model achieved an AUC of 0.898 in internal testing and 0.913 in external validation.
Poor nutritional status and elevated inflammation were identified as significant risk factors for cachexia.
The simplified decision tree model retained an AUC greater than 0.783, preserving high diagnostic accuracy.
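
The AUC values quoted above can be read as the probability that a randomly chosen patient with cachexia receives a higher predicted risk than a randomly chosen patient without it. The following is a minimal sketch of that definition in plain Python, not the authors' evaluation code; the function name `auc` and the toy scores and labels are invented for illustration and do not come from the study.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Returns the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only (not study data).
toy_scores = [0.9, 0.8, 0.35, 0.6, 0.3, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_auc = auc(toy_scores, toy_labels)  # 8 of 9 pairs are ranked correctly
```

The pairwise form is quadratic in the number of patients, which is acceptable for a sketch; a production evaluation would more likely rely on a rank-based routine from a statistics library.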

Abstract

Background: Cachexia is a frequent, specific metabolic syndrome that severely compromises survival in gastric cancer (GC). While early diagnosis is paramount, existing screening methods are limited by complexity and suboptimal accuracy. There is an urgent need for an efficient, data-driven tool derived from routine clinical parameters.

Methods: In this multicenter retrospective study, we analyzed data from three independent hospitals. Variable selection was performed using univariable and multivariable analyses. We constructed and compared multiple machine learning (ML) models to predict cachexia risk. The models’ discriminative ability, calibration, and clinical net benefit were evaluated via AUC, calibration plots, and Decision Curve Analysis (DCA), respectively.

Results: The study included 1,570 GC patients (cachexia prevalence: 30.3%). Patients were divided into training (n=920), internal testing (n=350), and external validation (n=300) cohorts. Cachexia was significantly associated with poor nutritional status, elevated inflammation, and inferior overall survival (P < 0.01). The Random Forest (RF) model yielded the best performance and remained stable across the internal test set (AUC = 0.898) and the external validation set (AUC = 0.913). To enhance clinical utility, we further derived a simplified decision tree model based on three accessible markers: CA19-9, CEA, and albumin. This simplified tool retained high diagnostic accuracy (AUC > 0.783) and demonstrated significant positive net benefits in DCA.

Conclusion: We successfully established and externally validated a high-performance ML model for predicting GC-associated cachexia. Crucially, the derived simplified decision tree offers a convenient, highly generalizable tool for clinicians to identify high-risk patients using routine laboratory tests, enabling earlier precision nutritional management.
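
To make the "net benefit" reported by Decision Curve Analysis more concrete, the sketch below applies the standard DCA formula: the true-positive rate minus the false-positive rate weighted by the odds of the chosen threshold probability. It is an illustration under stated assumptions, not the authors' analysis. The function names, the toy predictions, and the threshold values 0.5 and 0.2 are hypothetical; only the prevalence of 30.3% is taken from the abstract.

```python
def net_benefit(probs, labels, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold.

    NB = TP/N - (FP/N) * threshold / (1 - threshold)
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    weight = threshold / (1.0 - threshold)  # odds of the threshold probability
    return tp / n - (fp / n) * weight


def net_benefit_treat_all(prevalence, threshold):
    """Net benefit of the reference strategy that treats every patient."""
    weight = threshold / (1.0 - threshold)
    return prevalence - (1.0 - prevalence) * weight


# Toy predictions, invented for illustration only (not study data).
toy_probs = [0.9, 0.7, 0.4, 0.6, 0.2, 0.1, 0.05, 0.3, 0.15, 0.25]
toy_labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
nb_model = net_benefit(toy_probs, toy_labels, threshold=0.5)

# "Treat all" baseline at the reported prevalence (30.3%) and an
# assumed threshold of 0.2, chosen here purely as an example.
nb_all = net_benefit_treat_all(prevalence=0.303, threshold=0.2)
```

A model shows clinical value at a given threshold when its net benefit exceeds both the "treat all" baseline and the "treat none" baseline, whose net benefit is zero by definition.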
