Accurate risk stratification in oncology is essential for guiding treatment decisions, yet current algorithms rely on a narrow set of structured variables and may therefore miss the rich signal in narrative pathology reports. These reports contain nuanced morphological descriptions and expert clinical judgment, but this information remains largely unused in clinical decision-making because it is locked in free-text prose. We hypothesized that large language models (LLMs) could extract prognostic information from complete free-text pathology reports and convert it into a binary survival biomarker. We used the open-weight LLaMA 3.3 70B model to generate risk scores directly from publicly available pathology reports across three gastrointestinal cancer types. The model was prompted to synthesize each complete narrative report into a binary prognostic score. We evaluated associations between the LLM-generated scores and survival outcomes, including overall survival, progression-free survival, and disease-specific survival. In colorectal cancer, LLM-generated risk scores showed significant prognostic value for overall survival (hazard ratio (HR) = 2.77, 95% confidence interval (CI) = 1.92–3.97, p < 0.001), progression-free survival (HR = 2.93, 95% CI = 2.11–4.08, p < 0.001), and disease-specific survival (HR = 5.85, 95% CI = 3.66–9.36, p < 0.001). Multivariate analysis confirmed the LLM-generated risk score as an independent prognostic factor for progression-free survival. These results indicate that LLMs can turn narrative pathology reports into a single, independent survival biomarker. The approach uses routinely available free-text documentation and requires no additional tissue analysis or pathologist workload, offering a deployable method to enhance risk stratification for treatment decision-making.
Loeffler et al. (Fri,) studied this question.
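The hazard ratios quoted in the abstract can be sanity-checked from the reported numbers alone. The sketch below is not the authors' analysis: it assumes the intervals are Wald-type confidence intervals (symmetric on the log scale, as is typical for Cox regression output), and the helper name `wald_summary` is hypothetical. Under that assumption, the standard error of the log hazard ratio, the z statistic, and a two-sided p-value can be recovered from each HR and its 95% CI.

```python
import math

def wald_summary(hr, ci_low, ci_high, z_crit=1.959964):
    """Recover log-HR statistics from a reported HR and its 95% CI.

    Assumes a Wald interval: exp(log(HR) +/- z_crit * SE).
    """
    log_hr = math.log(hr)
    # The CI width on the log scale equals 2 * z_crit * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_hr / se
    # Two-sided p-value under the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    # For a log-symmetric CI, the geometric mean of the bounds equals the HR.
    midpoint = math.sqrt(ci_low * ci_high)
    return {"log_hr": log_hr, "se": se, "z": z, "p": p, "midpoint": midpoint}

# Values as reported in the abstract for colorectal cancer: (HR, CI low, CI high).
reported = {
    "overall survival": (2.77, 1.92, 3.97),
    "progression-free survival": (2.93, 2.11, 4.08),
    "disease-specific survival": (5.85, 3.66, 9.36),
}

for endpoint, (hr, lo, hi) in reported.items():
    s = wald_summary(hr, lo, hi)
    print(f"{endpoint}: HR={hr}, CI midpoint={s['midpoint']:.2f}, "
          f"SE={s['se']:.3f}, z={s['z']:.2f}, p={s['p']:.1e}")
```

For all three endpoints, the geometric midpoint of the interval lands within about 1% of the reported hazard ratio, and the implied p-values fall well below 0.001, so the reported figures are internally consistent under the Wald assumption.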