What is the clinical evidence from this study?

Study design: Cohort. Population: Breast cancer (n=643). Intervention: XGBoost predictive model vs. other machine learning models (LR, RF, SVM, GBM). Primary outcome: Prediction of adverse postoperative outcomes (AUC in external validation set, 95% CI 0.690-0.870).

What question did this study set out to answer?

This research aims to evaluate the effectiveness of machine learning models in predicting adverse postoperative outcomes following breast cancer surgery.

May 3, 2026 · Open Access

Perioperative machine learning models with SHAP interpretation for predicting adverse outcomes in breast cancer surgery

Key Result

The XGBoost model constructed using perioperative data effectively predicted adverse postoperative outcomes in patients with breast cancer undergoing surgery, achieving an AUC of 0.780 in the external validation set.
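For readers less familiar with the metric, the AUC reported above can be read as the probability that a randomly chosen patient with an adverse outcome receives a higher predicted risk than a randomly chosen patient without one. The sketch below computes it by direct pairwise comparison. The function name `roc_auc` and the toy data are illustrative assumptions, not the study's code or data.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    labels: 1 for an adverse outcome, 0 otherwise.
    scores: predicted risk for each case (higher means more risk).
    A tie between a positive and a negative score counts as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data, not taken from the study.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80]
toy_auc = roc_auc(toy_labels, toy_scores)  # 3 of 4 pairs ranked correctly
```

The pairwise loop is quadratic in the number of cases, which is fine for a cohort of a few hundred patients; rank-based formulas give the same value faster on large data sets.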

Key Points

Collected perioperative data from 643 treatment-naïve breast cancer patients across two medical centers.
Developed five predictive models (XGBoost, RF, GBM, SVM, and LR) after a stratified 7:3 split of the modeling set into training and internal validation sets.
Evaluated model performance using AUC, calibration curves, and decision curve analysis, and used SHAP analysis to interpret key decision factors.
The XGBoost model showed optimal performance with an AUC of 0.840 in internal and 0.780 in external validation sets.
In the external validation set, the XGBoost model's specificity (0.881) and F1 score (0.514) were higher than those of the other models.
SHAP analysis identified the systemic immune-inflammation index, prognostic nutritional index, and age as the top predictive factors.
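The two blood-derived indices named in the last key point are usually computed from routine laboratory values. The sketch below uses the commonly cited definitions (SII as platelets times neutrophils divided by lymphocytes, and the Onodera form of the PNI); the summary does not state which exact definitions or units the study used, so treat both formulas, the function names, and the example values as assumptions.

```python
def systemic_immune_inflammation_index(platelets, neutrophils, lymphocytes):
    """SII = platelets * neutrophils / lymphocytes.

    All three counts are assumed to be in 10^9 cells per litre.
    """
    if lymphocytes <= 0:
        raise ValueError("lymphocyte count must be positive")
    return platelets * neutrophils / lymphocytes


def prognostic_nutritional_index(albumin_g_per_l, lymphocytes):
    """PNI = serum albumin (g/L) + 5 * lymphocyte count (10^9/L)."""
    return albumin_g_per_l + 5.0 * lymphocytes


# Hypothetical laboratory values, not taken from the study.
example_sii = systemic_immune_inflammation_index(250.0, 4.0, 2.0)
example_pni = prognostic_nutritional_index(40.0, 2.0)
```

A higher SII reflects more platelets and neutrophils relative to lymphocytes, while a lower PNI reflects lower albumin and lymphocyte levels.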

Study Design

Type

Cohort (n=643)

Multicenter

Yes

Structured PICO

Can machine learning models using perioperative data predict adverse postoperative outcomes in patients undergoing breast cancer surgery?

Population

643 treatment-naïve patients with breast cancer who underwent surgical treatment (443 in modeling set, 200 in external validation set from two independent medical centers)
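According to the abstract, the modeling set was further divided into training and internal validation sets by a stratified 7:3 split, which keeps the proportion of adverse events roughly equal in both parts. A minimal sketch of such a split follows. The function `stratified_split`, the seed, and the toy event counts are illustrative assumptions; the study's actual event rate and tooling are not given here.

```python
import random


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Split case indices into two groups while preserving class proportions.

    Each class is shuffled and cut separately, so both groups keep
    approximately the same share of positive cases.
    """
    rng = random.Random(seed)
    indices_by_class = {}
    for index, label in enumerate(labels):
        indices_by_class.setdefault(label, []).append(index)

    train, validation = [], []
    for indices in indices_by_class.values():
        shuffled = indices[:]
        rng.shuffle(shuffled)
        cut = round(len(shuffled) * train_fraction)
        train.extend(shuffled[:cut])
        validation.extend(shuffled[cut:])
    return sorted(train), sorted(validation)


# Hypothetical cohort: 100 cases with 20 adverse events (not the study's data).
toy_labels = [1] * 20 + [0] * 80
train_idx, valid_idx = stratified_split(toy_labels)
train_events = sum(toy_labels[i] for i in train_idx)
valid_events = sum(toy_labels[i] for i in valid_idx)
```

With a low event rate, a plain random split can leave the smaller part with very few positive cases, which is why stratification is a common choice in this setting.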

Intervention

Machine learning models (Extreme Gradient Boosting [XGBoost], Random Forest [RF], Gradient Boosting Machine [GBM], Support Vector Machine [SVM], and Logistic Regression [LR]) using perioperative data

Outcome

Adverse postoperative outcomes / postoperative adverse prognosis

An XGBoost machine learning model using perioperative data, particularly the systemic immune-inflammation index, can effectively predict adverse postoperative outcomes in breast cancer surgery patients.

Limitations

Retrospective study design
Limited sample size
Class imbalance caused by the low incidence of positive events

Abstract

Objective: To investigate the clinical value of a machine learning model constructed using perioperative data for predicting adverse postoperative outcomes in patients undergoing breast cancer surgery, and to identify key decision factors through SHAP interpretability analysis.

Methods: Perioperative core indicators and follow-up data from 643 treatment-naïve patients with breast cancer who underwent surgical treatment were retrospectively collected, including 443 cases in the modeling set and 200 cases in the external validation set, derived from two independent medical centers. The modeling set was stratified and split into training and internal validation sets in a 7:3 ratio. After screening key variables using univariate analysis in the training set, five predictive models for postoperative adverse prognosis were developed based on Extreme Gradient Boosting (XGBoost), Random Forest (RF), Gradient Boosting Machine (GBM), Support Vector Machine (SVM), and Logistic Regression (LR) algorithms. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC), calibration curves (CC), and decision curve analysis (DCA) in both the internal and external validation sets, and the feature contributions of the optimal model were interpreted using the Shapley Additive exPlanations (SHAP) method.

Results: The predictive model for postoperative adverse prognosis constructed using the XGBoost algorithm demonstrated optimal performance, showing strong discriminatory ability in both the internal (AUC = 0.840) and external (AUC = 0.780) validation sets. In the external validation set, its specificity (0.881) and F1 score (0.514) were higher than those of the other models. In addition, calibration analysis indicated good agreement between the predicted probabilities and actual incidence rates for the XGBoost model, and decision curve analysis demonstrated that it provided the highest clinical net benefit across most threshold ranges. SHAP analysis revealed that the top three variables contributing the most to the XGBoost model's prediction of postoperative adverse prognosis were, in descending order, the systemic immune-inflammation index (SII), prognostic nutritional index (PNI), and age.

Conclusion: The XGBoost model constructed using perioperative data can effectively predict adverse postoperative outcomes in patients with breast cancer undergoing surgery, outperforming traditional models and other machine learning approaches. The preoperative SII level is the most critical predictive factor.
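Specificity and the F1 score, both reported in the Results, are derived from the counts of a confusion matrix at a chosen decision threshold. The sketch below shows the standard formulas. The helper names and the counts are hypothetical and are not the study's confusion matrix, which this summary does not report.

```python
def specificity(true_negatives, false_positives):
    """Share of patients without an adverse outcome who are correctly ruled out."""
    return true_negatives / (true_negatives + false_positives)


def f1_score(true_positives, false_positives, false_negatives):
    """Harmonic mean of precision and recall for the positive class."""
    denominator = 2 * true_positives + false_positives + false_negatives
    if denominator == 0:
        raise ValueError("F1 is undefined when there are no positive cases or predictions")
    return 2 * true_positives / denominator


# Hypothetical counts for illustration only.
toy_tp, toy_fp, toy_fn, toy_tn = 8, 4, 4, 84
toy_specificity = specificity(toy_tn, toy_fp)
toy_f1 = f1_score(toy_tp, toy_fp, toy_fn)
```

Because F1 ignores true negatives, it can stay moderate even when specificity is high, which is worth keeping in mind when positive events are rare.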
