Abstract

Background: Patients with invasive breast cancer (IBC) account for the vast majority of breast cancer cases and exhibit significant heterogeneity. A model that can accurately predict their long-term postoperative breast cancer-specific survival (BCSS) is therefore needed.

Methods: We used data from the Surveillance, Epidemiology, and End Results (SEER) database covering January 2010 to December 2020 and enrolled an independent external cohort from Guangxi Medical University Affiliated Cancer Hospital (GMUACH). We constructed four prediction models to predict the 3-year, 5-year, 7-year, and 10-year BCSS of IBC patients: Random Survival Forest (RSF), Survival Gradient Boosting Machine (Survival-GBM), Survival Extreme Gradient Boosting (Survival-XGBoost), and the Least Absolute Shrinkage and Selection Operator-Cox proportional hazards model (LASSO-Cox).

Results: The RSF model outperformed the other three models, with a C-index of 0.824 (95% CI: 0.817–0.831) in the training set, 0.689 (95% CI: 0.670–0.707) in the internal validation set, and 0.716 (95% CI: 0.649–0.772) in the external validation set. Time-dependent Brier scores indicated good calibration and predictive accuracy. Decision curve analysis (DCA) indicated stable clinical utility, and Shapley Additive Explanations (SHAP) plots showed the relative importance of the prognostic predictors. The RSF model also stratified patients effectively into distinct BCSS risk subgroups.

Conclusions: This study developed a prognostic model for predicting the long-term BCSS of IBC patients, which can support risk stratification in the clinical management of these patients.
Liu et al. studied this question.
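The C-index reported in the Results measures how often a model ranks patients correctly: among pairs where it is known which patient died first, it is the fraction in which that patient was also assigned the higher predicted risk. The sketch below is a minimal pure-Python illustration of Harrell's C-index, not the authors' implementation. The function name `harrell_c_index` and the toy data are made up for illustration, and pairs with tied survival times are simply skipped, which is a simplification.

```python
from typing import Sequence


def harrell_c_index(
    times: Sequence[float],
    events: Sequence[int],
    risks: Sequence[float],
) -> float:
    """Harrell's concordance index for right-censored survival data.

    times  : follow-up time per patient
    events : 1 if the event (e.g. breast cancer death) was observed, 0 if censored
    risks  : predicted risk score per patient (higher = worse prognosis)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            # A censored patient cannot be the earlier member of a comparable pair.
            continue
        for j in range(n):
            # Patient i had the event strictly before patient j's follow-up time.
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0  # correctly ranked
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied risk counts as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data, not taken from SEER or GMUACH.
toy_times = [2, 4, 6, 8, 10]
toy_events = [1, 0, 1, 1, 0]
toy_risks = [0.9, 0.3, 0.4, 0.7, 0.1]

c_index = harrell_c_index(toy_times, toy_events, toy_risks)
print(round(c_index, 3))  # 0.857 (6 of 7 comparable pairs ranked correctly)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported values of 0.689 to 0.824 should be read.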