What does this research mean for the field?

A Random Forest machine learning model incorporating clinical and laboratory features accurately predicts 60-day survival in patients with newly diagnosed advanced solid tumors, though its ability to predict short-term mortality remains limited. Novelty: methodological. Consensus alignment: neutral.

What question did this study set out to answer?

This study aims to predict 60-day mortality in patients with advanced solid tumors using machine learning techniques.

May 29, 2026

Machine learning modeling for predicting 60-day prognosis in newly diagnosed advanced solid tumors.

Key Points

This study aims to predict 60-day mortality in patients with advanced solid tumors using machine learning techniques.
Retrospective analysis of patients from Sheba Medical Center between 2017 and 2025.
Developed models (Logistic Regression, Random Forest, XGBoost) to predict 60-day survival based on 74 clinical features.
Generated the raw dataset with the MDClone platform and validated LLM extraction of clinical variables from physician notes against expert manual review.
Random Forest achieved the highest ROC AUC of 0.84 (95% CI 0.79-0.88).
Sensitivity for Random Forest was 0.89 (95% CI 0.85-0.92) with a specificity of 0.62 (95% CI 0.49-0.72).
Models had high positive predictive value for survival, but limited ability to predict short-term mortality.
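
The contrast in the last key point follows largely from class balance: most test patients survived, so a "will survive" call is usually right while a "will die" call is right far less often. A minimal sketch, assuming survival is the positive class and a survival prevalence of about 0.81 in the test cohort (taken from the 19% 60-day mortality given in the abstract). The function name `predictive_values` is illustrative and does not come from the study.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) given sensitivity, specificity and positive-class prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)   # P(survived | predicted survival)
    npv = true_neg / (true_neg + false_neg)   # P(died | predicted death)
    return ppv, npv

# Random Forest point estimates from the key points above.
# The 0.81 prevalence is an assumption derived from the reported 19% mortality.
ppv, npv = predictive_values(sensitivity=0.89, specificity=0.62, prevalence=0.81)
print(f"PPV={ppv:.2f}  NPV={npv:.2f}")
```

Under these rounded inputs the sketch gives a PPV near 0.91 and an NPV near 0.57, consistent with the Random Forest values listed in the abstract's results table.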

Abstract

Abstract number: 1622

Background: Accurate mid-term prognosis estimation in patients newly diagnosed with advanced, non-operable or metastatic cancer is critical for prioritizing evaluation and treatment and for planning palliative care. A 60-day time horizon is clinically meaningful, as major oncologic assessments typically occur at 2–3-month intervals. Existing prognostic tools have shown limited clinical adoption.

Methods: We conducted a retrospective analysis of patients evaluated at the Sheba Medical Center Oncology Rapid Diagnosis Clinic between 2017 and 2025. Eligible patients had metastatic or locally advanced, non-operable solid tumors. A raw dataset was generated using the MDClone platform (overall 60-day mortality ~18%). A case-enriched training cohort (n=524; 39.3% 60-day mortality) and an independent test cohort (n=343; 19% 60-day mortality) were randomly selected. Seventy-four features were included, encompassing demographics, laboratory values, ECOG performance status, and clinical variables extracted from physician notes using a large language model (LLM). LLM extraction accuracy was validated against expert manual review in 32 randomly selected cases (accuracy 0.91–1.0). Logistic Regression, Random Forest, and XGBoost models were trained with grid-search hyperparameter optimization to predict 60-day survival (y=1) versus death (y=0). Performance was evaluated on the independent test set. Wilson score intervals were used for proportion confidence intervals, and the Hanley–McNeil method was used for AUC confidence intervals. Feature importance for the top-performing model was assessed using SHAP.

Results: Performance of the three ML models is summarized in Table 1. Random Forest demonstrated the best overall performance. All models showed high positive predictive value for survival but limited ability to predict short-term mortality. The most influential Random Forest features included INR, CRP, albumin, ECOG status, total protein, LDH, emergency-related hospitalizations, age, neutrophil count, and history of cardiovascular disease.

Conclusions: A Random Forest–based model accurately identifies patients likely to survive 60 days, supporting timely evaluation and initiation of oncologic treatment. Prediction of short-term mortality remains limited, suggesting intrinsic unpredictability of mid-term outcomes or the need for additional clinical or biological features.

Table 1. Model performance on the independent test set, reported as estimate (95% CI).

Metric        Random Forest       Logistic Regression   XGBoost
ROC AUC       0.84 (0.79-0.88)    0.81 (0.76-0.86)      0.80 (0.75-0.85)
Sensitivity   0.89 (0.85-0.92)    0.82 (0.77-0.86)      0.80 (0.75-0.84)
Specificity   0.62 (0.49-0.72)    0.62 (0.49-0.72)      0.63 (0.51-0.74)
PPV           0.91 (0.87-0.94)    0.90 (0.86-0.93)      0.90 (0.86-0.93)
NPV           0.57 (0.45-0.68)    0.45 (0.35-0.55)      0.43 (0.33-0.53)
Accuracy      0.84 (0.80-0.87)    0.78 (0.74-0.82)      0.77 (0.72-0.81)
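
The two interval methods named in the Methods can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the class counts (278 survivors, 65 deaths) are inferred from the reported test cohort (n=343, 19% mortality), the inputs are the rounded Random Forest point estimates, and the function names are hypothetical.

```python
import math

Z95 = 1.959964  # two-sided 95% quantile of the standard normal

def wilson_interval(p_hat, n, z=Z95):
    """Wilson score interval for a proportion p_hat observed on n cases."""
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half

def hanley_mcneil_interval(auc, n_pos, n_neg, z=Z95):
    """Normal-approximation CI for an AUC using the Hanley-McNeil standard error."""
    q1 = auc / (2 - auc)
    q2 = 2 * auc**2 / (1 + auc)
    variance = (auc * (1 - auc)
                + (n_pos - 1) * (q1 - auc**2)
                + (n_neg - 1) * (q2 - auc**2)) / (n_pos * n_neg)
    se = math.sqrt(variance)
    return auc - z * se, auc + z * se

# Assumed counts: 343 test patients at ~19% mortality -> about 65 deaths, 278 survivors.
N_SURVIVED, N_DIED = 278, 65

# Sensitivity is a proportion among survivors (the positive class, y=1).
sens_lo, sens_hi = wilson_interval(0.89, N_SURVIVED)
auc_lo, auc_hi = hanley_mcneil_interval(0.84, N_SURVIVED, N_DIED)

print(f"Sensitivity 95% CI: {sens_lo:.3f} to {sens_hi:.3f}")
print(f"ROC AUC 95% CI:     {auc_lo:.3f} to {auc_hi:.3f}")
```

With these inputs the intervals land close to the reported Random Forest values; small differences in the last digit are expected because the sketch starts from rounded estimates and inferred counts.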


Cite This Study

Malyanker et al. studied this question.

synapsesocial.com/papers/6a192dbbfab5b468c4416a7a https://doi.org/10.1200/jco.2026.44.16_suppl.1622
