August 1, 2025

Development of a Deep Learning Model to Predict 5-Year Mortality in Non-Small Cell Lung Cancer Using the Korean Central Cancer Registry (Preprint)

Key Points

All five deep learning models predicted 5-year mortality in non-small cell lung cancer with comparable discrimination on the test set (AUC 0.875–0.879).
Group-wise permutation importance showed that permuting the staging group caused the largest drop in AUC, followed by pulmonary function tests, symptoms, and age.
Patients diagnosed between 2014 and 2017 with complete clinical, pulmonary function, histological, genomic, and staging data were split into stratified training, validation, and test sets (70%:15%:15%).
A Cox proportional hazards model served as the baseline comparator for the deep learning models.
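
As an illustration of the stratified 70%:15%:15% split reported in the Methods, here is a minimal sketch in Python. It is not the authors' code: the function name `stratified_split`, the seed, and the synthetic labels are assumptions made only for this example.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.70, 0.15, 0.15), seed=0):
    """Return (train, val, test) index lists that preserve class proportions.

    Hypothetical helper for illustration; not taken from the study.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)

    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)  # randomize order within each outcome class
        n_train = round(fractions[0] * len(indices))
        n_val = round(fractions[1] * len(indices))
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


# Synthetic labels (1 = died within 5 years); NOT registry data.
labels = [1] * 600 + [0] * 400
train_idx, val_idx, test_idx = stratified_split(labels)
```

Splitting within each outcome class keeps the mortality rate the same in all three sets, which matters when a single held-out test set is used to compare models.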

Abstract

BACKGROUND Non-small cell lung cancer (NSCLC) is one of the most common cancers and a leading cause of cancer-related mortality, making prognostic prediction clinically essential. Machine learning models are increasingly used to assess prognosis; however, developing systems that combine high discrimination with clear, clinically interpretable reasoning remains challenging.

OBJECTIVE To develop deep learning models that predict 5-year mortality in NSCLC using data from the Korea Central Cancer Registry (KCCR) and to quantify feature importance through permutation testing.

METHODS We identified patients diagnosed between 2014 and 2017 who had complete clinical data, pulmonary function test results, histological information, genomic data, and staging details. After preprocessing, the cohort was divided into stratified training, validation, and test sets in a 70%:15%:15% ratio. Five models were tuned using Hyperband across ten predefined feature groups. The primary evaluation metric was the area under the receiver operating characteristic curve (AUC); accuracy, F1 score, precision, and recall were also reported. Group-wise permutation importance was calculated for each model, and the concordance of importance rankings was assessed using the Friedman test. A Cox proportional hazards (CPH) model was used as a baseline comparator.

RESULTS All five models yielded comparable discrimination on the test set (AUC 0.875–0.879; accuracy 0.796–0.822; F1 0.815–0.846). Permuting the 'Stage' group resulted in the largest decrease in AUC, followed by 'Pulmonary Function Test', 'Symptoms', and 'Age'. The 'Gene Mutation' group had a modest overall impact but became more influential within the adenocarcinoma subset. The Friedman test showed no statistically significant differences in importance rankings across the models (p = .928).

CONCLUSIONS A carefully tuned, grouped-input deep learning framework offered reliable and interpretable predictions for 5-year mortality in NSCLC. Group-level permutation importance provided stable and reproducible insights into the clinical factors influencing risk, which may guide future model refinement and clinical decision-making.
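
To make the group-wise permutation importance concrete, here is a minimal sketch in Python. It shuffles all columns of a feature group together and records the resulting drop in AUC. Everything in it is an assumption for illustration: the helper names `auc` and `group_permutation_importance`, the group names `stage` and `noise`, the additive stand-in scorer, and the synthetic data. None of it comes from the study's models or the registry.

```python
import random


def auc(y_true, scores):
    """AUC via the rank-sum (Mann-Whitney U) identity, using average ranks for ties."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie block
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    pos_rank_sum = sum(r for r, y in zip(ranks, y_true) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def group_permutation_importance(predict, X, y, groups, n_repeats=5, seed=0):
    """Mean AUC drop when every column of a feature group is shuffled jointly."""
    rng = random.Random(seed)
    baseline = auc(y, [predict(row) for row in X])
    importance = {}
    for name, cols in groups.items():
        drops = []
        for _ in range(n_repeats):
            perm = list(range(len(X)))
            rng.shuffle(perm)
            X_perm = [list(row) for row in X]
            for i, src in enumerate(perm):
                for c in cols:  # same source row for all columns in the group
                    X_perm[i][c] = X[src][c]
            drops.append(baseline - auc(y, [predict(row) for row in X_perm]))
        importance[name] = sum(drops) / n_repeats
    return baseline, importance


# Synthetic data: columns 0-1 form an informative group, column 2 is noise.
data_rng = random.Random(1)
X = [[data_rng.random() for _ in range(3)] for _ in range(200)]
y = [1 if row[0] + row[1] > 1 else 0 for row in X]


def predict(row):
    """Stand-in scorer that uses only the informative columns."""
    return row[0] + row[1]


groups = {"stage": [0, 1], "noise": [2]}  # hypothetical group definitions
baseline, importance = group_permutation_importance(predict, X, y, groups)
```

Shuffling a group as a unit preserves the correlations among its own columns while breaking its link to the outcome, so the AUC drop reflects the group's joint contribution rather than that of any single feature.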

Cite This Study

Lee et al. studied this question.

synapsesocial.com/papers/689a0c6be6551bb0af8cfec2 https://doi.org/10.2196/preprints.80574
