Background: Cachexia is a frequent, specific metabolic syndrome that severely compromises survival in gastric cancer (GC). Early diagnosis is paramount, but existing screening methods are limited by complexity and suboptimal accuracy. There is an urgent need for an efficient, data-driven tool derived from routine clinical parameters.

Methods: In this multicenter retrospective study, we analyzed data from three independent hospitals. Variables were selected using univariable and multivariable analyses. We constructed and compared multiple machine learning (ML) models to predict cachexia risk. Discrimination, calibration, and clinical net benefit were evaluated using the area under the receiver operating characteristic curve (AUC), calibration plots, and decision curve analysis (DCA), respectively.

Results: The study included 1,570 GC patients (cachexia prevalence: 30.3%), divided into training (n = 920), internal testing (n = 350), and external validation (n = 300) cohorts. Cachexia was significantly associated with poor nutritional status, elevated inflammation, and inferior overall survival (P < 0.01). The Random Forest (RF) model performed best and remained stable across the internal test set (AUC = 0.898) and the external validation set (AUC = 0.913). To enhance clinical utility, we further derived a simplified decision tree model based on three accessible markers: CA19-9, CEA, and albumin. This simplified tool retained high diagnostic accuracy (AUC 0.783) and showed positive net benefit in DCA.

Conclusion: We established and externally validated a high-performance ML model for predicting GC-associated cachexia. The derived simplified decision tree offers clinicians a convenient, highly generalizable tool for identifying high-risk patients from routine laboratory tests, enabling earlier precision nutritional management.
Zhao et al. (Tue,) studied this question.
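The abstract reports clinical net benefit from decision curve analysis without giving the formula. The sketch below shows the standard DCA net benefit at a threshold probability p_t: true positives per patient minus false positives per patient weighted by p_t / (1 - p_t). It is only an illustration, not the authors' implementation. The function names `net_benefit` and `net_benefit_treat_all` are hypothetical, and the labels and predicted risks are invented toy values, not study data.

```python
from typing import Sequence


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float], threshold: float) -> float:
    """Net benefit of intervening on patients whose predicted risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(y_true)
    # Count true and false positives among patients flagged at this threshold.
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true: Sequence[int], threshold: float) -> float:
    """Reference strategy: intervene on every patient regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy data (invented for illustration): 1 = cachexia, 0 = no cachexia.
toy_labels = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
toy_risks = [0.9, 0.8, 0.7, 0.2, 0.6, 0.1, 0.3, 0.4, 0.35, 0.05]

model_nb = net_benefit(toy_labels, toy_risks, threshold=0.5)
treat_all_nb = net_benefit_treat_all(toy_labels, threshold=0.5)
print(f"model: {model_nb:.3f}, treat-all: {treat_all_nb:.3f}")
```

A decision curve plots this quantity over a range of thresholds. A model is considered clinically useful at thresholds where its net benefit exceeds both the treat-all strategy and the treat-none strategy, whose net benefit is zero.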