Abstract

Background: As the prognosis of gastric cancer has improved, identifying prognostic factors has become increasingly important. This study aimed to identify prognostic factors of gastric cancer using machine learning and statistical methods and to compare how effectively the different methodologies identify them.

Methods: We conducted a retrospective cohort study using cancer research data from gastric cancer survivors in Korea. Patients were followed from the date of curative treatment for gastric cancer to the date of recurrence, cancer-specific death, or censoring. Cox proportional hazards, random survival forest, XGBoost, and DeepSurv models were used to estimate the risk of recurrence and cancer-specific death. All models were trained on 80% of the data (the training set), and their concordance indices were compared on the remaining 20% (the test set). SHAP values were used to interpret variables in the machine learning models.

Results: A total of 11,029 gastric cancer survivors with a median follow-up of 6.19 years were included. Remnant stomach after gastric cancer treatment, T stage, and N stage were the most important features for recurrence and mortality in both the Cox model and the machine learning models. All models had a concordance index greater than 0.7, with no large differences among them.

Conclusions: The machine learning models were not inferior to the conventional statistical model and offer greater flexibility, especially when statistical assumptions are violated. The key prognostic factors identified through this approach include remnant stomach after treatment and cancer stage.
Lee et al. studied this question.
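The abstract compares models by their concordance index on a held-out test set. As a rough illustration of what that metric measures, the following is a minimal sketch of Harrell's concordance index for right-censored data. It is not the authors' implementation: the function name `concordance_index`, its arguments, and the toy data are all hypothetical, and pairs with tied observed times are simply skipped, which is a simplification.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Simplified Harrell's C-index (hypothetical helper, not from the study).

    times:       observed follow-up time per subject
    events:      1/True if the event (e.g. recurrence) was observed, 0/False if censored
    risk_scores: model output, where a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that subject `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if times[a] == times[b]:
            continue  # assumption: tied times are skipped in this sketch
        if not events[a]:
            continue  # earlier subject was censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # earlier event received the higher predicted risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (invented for illustration): four subjects, the second is censored.
toy_times = [2, 4, 6, 8]
toy_events = [1, 0, 1, 1]
toy_risk = [0.9, 0.4, 0.1, 0.5]
c_index = concordance_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the reported values above 0.7 mean that in more than 70% of comparable patient pairs the model assigned the higher risk to the patient whose event occurred first.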