While thousands of AI prediction models are published annually, few are adopted into routine practice, partly because improved statistical performance does not necessarily translate into meaningful impact on clinical decision-making. We evaluated how a machine learning–based prognostic tool influences clinician performance in colorectal liver metastases (CRLM). In a prospective, randomized multi-reader multi-case trial (NCT07027605; Registration Date: January 1, 2025), 12 surgical oncologists assessed 166 retrospective CRLM cases with and without tool assistance in a crossed design with a 5-week washout. The primary endpoint was the difference in area under the receiver operating characteristic curve (AUC) for predicting 3-year mortality between assisted and unassisted assessments. Between January and July 2025, the 12 readers completed 3984 assessments (12 readers × 166 cases × 2 reading conditions). Model assistance significantly improved the AUC for 3-year mortality prediction (mean difference 0.091; 95% CI 0.001–0.181; P = 0.048) and consistently improved accuracy across secondary prognostic endpoints. It also reduced decision time (2.53 vs. 3.04 minutes with and without assistance, respectively) and increased reader confidence. Benefits were greatest for junior to mid-level surgical oncologists. This exploratory study demonstrates that a machine learning prognostic tool can significantly improve accuracy, efficiency, and confidence in CRLM evaluation.
Chen et al. studied this question.
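To make the primary endpoint concrete, the following is a minimal sketch of how a mean per-reader AUC difference could be computed in a fully crossed multi-reader multi-case design. It is an illustration under stated assumptions, not the trial's analysis: the abstract does not specify the statistical method, and this sketch does not reproduce the reported confidence interval or P value. The function names (`auc`, `mean_auc_difference`), the simulated reader scores, and the alternating outcome labels are all hypothetical; only the design counts (12 readers, 166 cases, 2 conditions) come from the abstract.

```python
import random
from statistics import mean


def auc(scores, labels):
    """AUC as the probability that a positive case outscores a negative one.

    Ties between a positive and a negative case count as 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def mean_auc_difference(assisted, unassisted, labels):
    """Average over readers of (assisted AUC minus unassisted AUC)."""
    return mean(
        auc(a, labels) - auc(u, labels)
        for a, u in zip(assisted, unassisted)
    )


# Design counts taken from the abstract.
N_READERS, N_CASES, N_CONDITIONS = 12, 166, 2
total_assessments = N_READERS * N_CASES * N_CONDITIONS  # 3984

# Synthetic data for illustration only (NOT trial data).
rng = random.Random(0)
labels = [i % 2 for i in range(N_CASES)]  # hypothetical 3-year mortality labels


def simulate_reader(signal):
    """Hypothetical reader scores: outcome signal plus Gaussian noise."""
    return [signal * y + rng.gauss(0.0, 1.0) for y in labels]


unassisted = [simulate_reader(0.5) for _ in range(N_READERS)]
assisted = [simulate_reader(1.0) for _ in range(N_READERS)]

if __name__ == "__main__":
    print("total assessments:", total_assessments)
    print("mean AUC difference (synthetic):",
          round(mean_auc_difference(assisted, unassisted, labels), 3))
```

Because every reader scores every case under both conditions, the paired per-reader difference is a natural summary; a full analysis would also need to account for reader and case variability when forming the confidence interval.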