What question did this study set out to answer?

This review aims to assess the efficacy of imaging-based machine learning models in predicting responses to immune checkpoint inhibitors in non-small cell lung cancer.

May 30, 2026

Deep learning–based imaging models for predicting immune checkpoint inhibitor response in non–small cell lung cancer: A systematic review.

Key Points

This review aims to assess the efficacy of imaging-based machine learning models in predicting responses to immune checkpoint inhibitors in non-small cell lung cancer.
Conducted a systematic review of 97 studies with 255 models focused on immunotherapy response prediction using imaging-derived inputs.
Included studies using CT, PET-CT, or whole-slide images as inputs; when a study reported multiple models with small methodological variations, only the most representative or best-performing model was included.
Did not perform pooled performance estimation or formal meta-analysis because of marked heterogeneity in model design and input data.
81% of the models used radiomics-based feature extraction combined with regression or classical machine learning; end-to-end deep learning was used in only 23 models from 15 studies.
Non-radiomics models had a higher average area under the curve (AUC) than radiomics-based models (0.77 vs 0.73).
CT-based models showed superior performance (AUC 0.75) over PET-CT (AUC 0.66) and whole-slide images (AUC 0.62).

Abstract

e20582 Background: Imaging-based machine learning models have been widely studied in medicine. Beyond diagnostic models, models that predict prognosis or treatment response have also been studied. Predicting response to immune checkpoint inhibitors (ICIs) in patients with non-small cell lung cancer (NSCLC) is an important challenge because response is heterogeneous, and multiple studies have attempted to predict ICI response. However, substantial heterogeneity exists across studies in imaging modalities, feature extraction strategies, and modeling approaches, limiting the interpretability and robustness of the results.
Methods: We conducted a systematic review of imaging-based machine learning studies evaluating immunotherapy response prediction. Eligible studies used imaging-derived inputs, including CT, PET-CT, and whole-slide images (WSI), to predict treatment response or clinical outcomes. When multiple models with small methodological variations were reported within a single study, only the most representative or best-performing model was included. Given marked heterogeneity in model design and input data, pooled performance estimates and formal meta-analysis were not performed.
Results: A total of 97 studies comprising 255 models were included. Most models (81%) relied on radiomics-based feature extraction combined with regression or classical machine learning models, whereas only 23 models from 15 studies used end-to-end deep learning. In subgroup analyses, non-radiomics models (0.77) demonstrated a higher pooled AUC than radiomics-based models (0.73). CT-based models (0.75) were the most frequently studied and showed superior pooled performance compared with other modalities such as PET-CT (0.66) and WSI (0.62). Outcome definitions varied widely, and external validation was inconsistently performed.
Conclusions: This systematic review shows that imaging-based immunotherapy response prediction research is dominated by radiomics-driven models combined with classical machine learning, while end-to-end deep learning approaches remain uncommon. This pattern likely reflects practical constraints: many studies analyzed 3D imaging data and integrated clinical features in relatively small cohorts, which may have limited the use of end-to-end training or fine-tuning strategies and made radiomics-based classical machine learning a more practical choice. The relatively lower performance observed for PET-CT and WSI models likely reflects smaller sample sizes and limited validation rather than limitations of the modalities themselves. Overall, the included studies were heterogeneous in diagnostic modalities, modeling strategies, outcome definitions, and external validation, and more studies using larger cohorts with external validation will be important for AI-based immunotherapy response prediction.
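The selection rule and subgroup comparison described in the Methods and Results can be illustrated with a minimal Python sketch. It is not the authors' procedure: the abstract does not state how subgroup AUCs were summarized, so the unweighted mean used here is an assumption, and the record layout, function names, and AUC values are hypothetical placeholders rather than data from the review.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (study_id, model_family, auc).
# Placeholder values for illustration only, not data from the review.
records = [
    ("study_A", "radiomics", 0.60),
    ("study_A", "radiomics", 0.70),
    ("study_B", "deep_learning", 0.80),
    ("study_C", "radiomics", 0.60),
    ("study_C", "deep_learning", 0.70),
]


def best_per_study(rows):
    """Keep the highest-AUC model for each (study, family) pair."""
    best = {}
    for study, family, auc in rows:
        key = (study, family)
        if key not in best or auc > best[key]:
            best[key] = auc
    return best


def mean_auc_by_family(best):
    """Unweighted mean AUC per model family (an assumed summary rule)."""
    groups = defaultdict(list)
    for (_, family), auc in best.items():
        groups[family].append(auc)
    return {family: round(mean(values), 3) for family, values in groups.items()}


selected = best_per_study(records)
summary = mean_auc_by_family(selected)
print(summary)
```

An unweighted mean treats every retained model equally regardless of cohort size, which is one reason a summary like this is descriptive only and not a substitute for a formal meta-analysis.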
