Abstract
Background: Accurate individualized prognostication in advanced non-small cell lung cancer (NSCLC) is hindered by nonlinear interactions among clinical and biological variables. Integrating interpretable artificial intelligence (AI) with classical survival modeling may enhance predictive precision while preserving transparency.
Methods: A real-world cohort of 62,608 advanced NSCLC patients with 52,782 observed death events was analyzed. Prognostic variables significantly associated with overall survival (OS) were identified using univariate Cox proportional-hazards regression (p < 0.05). Features were refined through LASSO-regularized Cox modeling with 5-fold cross-validation (λ = 0.00176) and stability selection (threshold ≥ 0.7), producing a 20-feature prognostic signature. These variables were then used to train and benchmark 18 regression and ensemble machine-learning algorithms for continuous OS prediction. Model performance was evaluated using R², RMSE, MAE, calibration plots, and decision-curve analysis (DCA). Feature interpretability was assessed with SHAP (Shapley Additive Explanations) to quantify the direction and magnitude of each predictor's effect.
Results: All 20 variables remained significant in multivariate Cox analysis. Chemotherapy, systemic therapy, and surgery were independent protective factors, whereas age, tumor size, metastatic burden, liver/bone metastasis, regional nodal involvement, and N stage predicted worse outcomes. Ensemble gradient-boosting models outperformed linear baselines (R² ≈ 0.15 vs. 0.10; RMSE ≈ 18 months). LightGBM achieved the highest accuracy (R² = 0.155; MAE = 11.9 months) with excellent calibration and the greatest net benefit on DCA. SHAP analysis identified chemotherapy, diagnosis year, organ metastatic number, and age as dominant determinants of predicted survival, with strong reproducibility across folds (Spearman ρ > 0.9).
Conclusions: A transparent Cox-LASSO-LightGBM-SHAP framework established a robust, biologically consistent 20-factor prognostic signature for advanced NSCLC. The model achieved high predictive accuracy, clinical calibration, and interpretability, revealing treatment modality, metastatic extent, and temporal therapeutic progress as principal survival drivers. This interpretable AI framework enables credible, individualized survival prediction and bridges data-driven modeling with clinical oncology.
Citation Format: Kang Qin, An Qin, John V. Heymach. AI-based explainable survival modeling for advanced non-small cell lung cancer [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2026; Part 1 (Regular Abstracts); 2026 Apr 17-22; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2026;86(7 Suppl):Abstract nr 4229.
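The evaluation metrics named in the Methods (R², RMSE, MAE) and the Spearman rank correlation used to check SHAP reproducibility across folds can be illustrated with a minimal sketch. This is not the authors' code. The function names are hypothetical, and the survival times and importance values below are invented toy numbers, not data from the study.

```python
import math

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the outcome (months)."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def mae(y_true, y_pred):
    """Mean absolute error, in the same units as the outcome (months)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(a, b):
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    ra, rb = average_ranks(a), average_ranks(b)
    ma, mb = sum(ra) / len(ra), sum(rb) / len(rb)
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    var_a = sum((x - ma) ** 2 for x in ra)
    var_b = sum((y - mb) ** 2 for y in rb)
    return cov / math.sqrt(var_a * var_b)

# Toy example (assumed values): observed vs. predicted OS in months.
y_true = [6.0, 12.0, 24.0, 36.0]
y_pred = [10.0, 14.0, 20.0, 30.0]
print("R2:", r2_score(y_true, y_pred))
print("RMSE:", rmse(y_true, y_pred))
print("MAE:", mae(y_true, y_pred))

# Toy example (assumed values): mean |SHAP| per feature in two CV folds.
fold_a = [0.40, 0.25, 0.20, 0.15]
fold_b = [0.38, 0.21, 0.24, 0.12]
print("Spearman rho:", spearman_rho(fold_a, fold_b))
```

A ρ close to 1 between folds means the features are ranked in nearly the same order of importance in each fold, which is the sense of reproducibility the Results appear to describe. In practice a library such as scikit-learn or SciPy would normally supply these metrics; the hand-written versions here only make the definitions explicit.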
Kang Qin
An Qin
John V. Heymach
Cancer Research
The University of Texas MD Anderson Cancer Center
Loyola University Chicago
www.synapsesocial.com/papers/69d1fdb0a79560c99a0a3cf4 — DOI: https://doi.org/10.1158/1538-7445.am2026-4229