What question did this study set out to answer?

Whether a machine learning prognostic tool improves clinician performance in predicting mortality for colorectal liver metastases.

April 10, 2026 · Open Access

Impact of an AI prognostic tool on clinician performance in colorectal liver metastases

Key Points

Evaluated the impact of a machine learning prognostic tool on clinician performance in predicting mortality for colorectal liver metastases (CRLM).
Conducted a prospective, randomized multi-reader multi-case trial.
Involved 12 surgical oncologists assessing 166 retrospective CRLM cases.
Compared assessments with and without tool assistance.
Primary endpoint: the difference in AUC for predicting 3-year mortality.
Collected 3984 assessments between January and July 2025.
Model assistance improved the AUC for 3-year mortality prediction (mean difference 0.091; P = 0.048).
Improved accuracy across secondary prognostic endpoints.
Reduced decision time from 3.04 to 2.53 minutes.
Increased reader confidence; benefits were greatest among junior to mid-level surgical oncologists.

Abstract

While thousands of AI prediction models are published annually, few are adopted into routine practice, partly because improved statistical performance does not necessarily translate into meaningful impact on clinical decision-making. We evaluated how a machine learning–based prognostic tool influences clinician performance in colorectal liver metastases (CRLM). In a prospective, randomized multi-reader multi-case trial (NCT07027605; Registration Date: January 1, 2025), 12 surgical oncologists assessed 166 retrospective CRLM cases with and without tool assistance in a crossed design with a 5-week washout. The primary endpoint was the difference in area under the receiver operating characteristic curve (AUC) for predicting 3-year mortality. Between January and July 2025, the readers completed 3984 assessments. Model assistance significantly improved the AUC for 3-year mortality prediction (mean difference 0.091; 95% CI 0.001–0.181; P = 0.048) and consistently improved accuracy across secondary prognostic endpoints. It also reduced decision time (2.53 vs. 3.04 minutes) and increased reader confidence. Benefits were greatest for junior to mid-level surgical oncologists. This exploratory study demonstrates that a machine learning prognostic tool can significantly improve accuracy, efficiency, and confidence in CRLM evaluation.
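To make the primary endpoint concrete, the following is a minimal sketch of how a per-reader AUC difference (assisted minus unassisted) could be computed and averaged across readers. It is an illustration only: the function names, the toy scores, and the toy outcome labels are all hypothetical and are not taken from the study. It does not reproduce the trial's statistical analysis, so it yields no confidence interval or P value. The first lines check the design arithmetic stated in the abstract, assuming each of the 12 readers assessed all 166 cases once under each of the two conditions.

```python
from statistics import mean


def auc(scores, labels):
    """AUC via the Mann-Whitney statistic: the probability that a randomly
    chosen positive case (label 1) receives a higher score than a randomly
    chosen negative case (label 0). Ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_auc_difference(unassisted, assisted, labels):
    """Return (mean difference, per-reader differences), where each
    difference is assisted AUC minus unassisted AUC for one reader."""
    diffs = [auc(a, labels) - auc(u, labels)
             for u, a in zip(unassisted, assisted)]
    return mean(diffs), diffs


# Design arithmetic from the abstract: readers x cases x reading conditions.
n_readers, n_cases, n_conditions = 12, 166, 2
total_assessments = n_readers * n_cases * n_conditions  # 3984

# Hypothetical toy data (NOT study data): 6 cases, 2 readers.
# labels: 1 = died within 3 years, 0 = survived.
labels = [1, 1, 0, 0, 1, 0]
# Each row holds one reader's predicted mortality risk for every case.
unassisted = [
    [0.6, 0.4, 0.5, 0.3, 0.7, 0.6],
    [0.5, 0.5, 0.4, 0.6, 0.8, 0.2],
]
assisted = [
    [0.7, 0.6, 0.4, 0.3, 0.8, 0.5],
    [0.6, 0.7, 0.4, 0.5, 0.9, 0.2],
]

mean_diff, per_reader = mean_auc_difference(unassisted, assisted, labels)
print(f"total assessments: {total_assessments}")
print(f"mean AUC difference (toy data): {mean_diff:.3f}")
```

A simple average like this ignores the correlation that arises when the same readers score the same cases under both conditions. Multi-reader multi-case studies normally account for that correlation when estimating uncertainty, which is why the sketch stops at the point estimate.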
