May 16, 2026 | Open Access

Development and internal validation of a machine learning–based model for predicting postoperative complications after primary liver cancer resection

Shuo Zhang (Fujian Normal University), Du Chen Hui (Xinjiang Medical University), Zhang Qing Long (Xinjiang Medical University)

Key Points

The study aimed to identify risk factors and develop a machine learning model to predict major postoperative complications (Clavien-Dindo grade ≥ III) after resection of primary liver cancer.
A total of 2,389 patients who underwent resection of primary liver cancer were retrospectively enrolled.
The data were divided into a training set (70%) and a test set (30%) using stratified sampling.
Seven machine learning models were developed and evaluated using multiple performance metrics.
The random forest model achieved an AUC of 0.843, accuracy of 0.851, and specificity of 0.907 in the test set.
Key predictors included surgical approach, alanine aminotransferase, and intraoperative blood loss.
In decision curve analysis, the random forest model provided greater net clinical benefit than the treat-all or treat-none strategies across a wide range of threshold probabilities.
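
The stratified 70/30 split mentioned above can be illustrated with a minimal sketch. The paper does not state which tool, random seed, or rounding rule was used, so the function `stratified_split`, the seed, and the per-class rounding below are assumptions. The labels are synthetic and carry only the class counts reported in the abstract (447 patients with major complications, 1,942 without).

```python
import random
from collections import defaultdict


def stratified_split(labels, train_frac=0.7, seed=0):
    """Return (train_idx, test_idx) so that each class keeps roughly
    the same share in both parts. Hypothetical helper, not the authors' code."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)  # assumed seed, for reproducibility only
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)
        # Assumed rule: round the per-class training size to the nearest integer.
        n_train = round(train_frac * len(members))
        train_idx.extend(members[:n_train])
        test_idx.extend(members[n_train:])
    return sorted(train_idx), sorted(test_idx)


# Synthetic labels: 1 = major complication (CD grade >= III), 0 = otherwise.
labels = [1] * 447 + [0] * 1942
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

Under this rounding rule the part sizes come out as 1,672 and 717, which agrees with the training and test set sizes given in the abstract, although the authors' actual procedure may differ.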

Abstract

OBJECTIVE: To identify risk factors for major postoperative complications after resection of primary liver cancer and to develop machine learning-based risk prediction models. We compared the predictive performance of multiple machine learning algorithms and evaluated the optimal model and its potential clinical utility.
METHODS: We retrospectively enrolled 2,389 patients who underwent resection of primary liver cancer at the First Affiliated Hospital of Xinjiang Medical University between January 2013 and December 2024. According to the Clavien-Dindo (CD) classification, patients with CD grade ≥ III were defined as having major postoperative complications (n = 447), while those with CD grade < III were classified as the non-complication group (n = 1,942). The dataset was divided into a training set (70%, n = 1,672) and a test set (30%, n = 717) using stratified sampling. Robust predictors were identified by taking the strict intersection of features selected by three methods: least absolute shrinkage and selection operator (LASSO) regression, XGBoost-based recursive feature elimination (RFE), and the random forest-based Boruta algorithm. Based on the selected features, seven machine learning models were developed: logistic regression (LR), support vector machine (SVM), decision tree (DT), random forest (RF), extremely randomized trees (ET), extreme gradient boosting (XGBoost), and light gradient boosting machine (LightGBM). Bayesian optimization was used for hyperparameter tuning. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, calibration curves, Brier score, and decision curve analysis (DCA). The optimal model was further interpreted using SHapley Additive exPlanations (SHAP), local interpretable model-agnostic explanations (LIME), and partial dependence plots/individual conditional expectation (PDP/ICE).
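
The "strict intersection" step in the methods can be sketched as follows: a feature is kept only if all three selection methods chose it. The helper `strict_intersection` and the per-method lists are hypothetical. The abstract reports only the final intersection, not what each method selected on its own, so the `placeholder_*` entries are invented for illustration.

```python
def strict_intersection(*selections):
    """Keep only the features that every selection method chose."""
    if not selections:
        return []
    common = set(selections[0]).intersection(*map(set, selections[1:]))
    # Preserve the order of the first list so the result is deterministic.
    return [name for name in selections[0] if name in common]


# Hypothetical per-method selections (NOT taken from the paper).
lasso_selected = ["albumin", "total_bilirubin", "prothrombin_time", "placeholder_a"]
rfe_selected = ["total_bilirubin", "albumin", "placeholder_b", "prothrombin_time"]
boruta_selected = ["prothrombin_time", "albumin", "total_bilirubin", "placeholder_a"]

robust = strict_intersection(lasso_selected, rfe_selected, boruta_selected)
print(robust)
```

Here `placeholder_a` is dropped even though two of the three methods selected it, which is what makes the intersection strict rather than a majority vote.
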
RESULTS: Eight key predictors were identified from the intersection of the three feature selection methods: surgical approach, alanine aminotransferase (ALT), intraoperative blood loss (IBL), liver stiffness measurement (LSM), prothrombin time (PT), total bilirubin, albumin (ALB), and intraoperative blood transfusion. Among the seven models, the RF model demonstrated the best overall performance in the test set, with an AUC of 0.843, accuracy of 0.851, specificity of 0.907, negative predictive value of 0.909, Brier score of 0.128, and F1 score of 0.602. SHAP analysis indicated that LSM, surgical approach, ALB, and IBL were the most influential predictors of major postoperative complications. DCA further showed that, across a wide range of threshold probabilities, RF-based risk stratification consistently provided greater net clinical benefit than either the treat-all or treat-none strategy.
CONCLUSION: Among the seven models, the RF model achieved the best predictive performance and can accurately estimate the risk of major postoperative complications after resection of primary liver cancer. This model may serve as a useful clinical decision-support tool for perioperative risk stratification and individualized patient management.
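
For readers unfamiliar with decision curve analysis, the comparison against treat-all and treat-none rests on the standard net benefit formula: at threshold probability p_t, net benefit = TP/n − (FP/n) × p_t / (1 − p_t). Treat-none has a net benefit of 0 by definition. The sketch below applies that formula to a five-patient toy example. The data are invented, the function names are hypothetical, and nothing here reproduces the paper's curves.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating every patient whose predicted risk
    is at or above the threshold probability."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    weight = threshold / (1.0 - threshold)
    return tp / n - (fp / n) * weight


def net_benefit_treat_all(y_true, threshold):
    """Net benefit when every patient is treated regardless of predicted risk."""
    return net_benefit(y_true, [1.0] * len(y_true), threshold)


# Toy data (invented): 1 = major complication, 0 = none.
y_true = [1, 1, 0, 0, 0]
y_prob = [0.9, 0.4, 0.6, 0.2, 0.1]

for p_t in (0.25, 0.5):
    model_nb = net_benefit(y_true, y_prob, p_t)
    all_nb = net_benefit_treat_all(y_true, p_t)
    print(f"p_t={p_t}: model={model_nb:.3f}, treat-all={all_nb:.3f}, treat-none=0.000")
```

A model is useful at a given threshold only when its net benefit exceeds both reference strategies, which is the pattern the abstract reports for the RF model across a wide range of thresholds.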
