Abstract

Objective: To evaluate potential screening mammography performance and workload impact using a commercial artificial intelligence (AI)–based triage device in a population-based screening sample.

Methods: In this retrospective study, a sample of 2129 women who underwent screening mammograms was evaluated. The performance of a commercial AI-based triage device was compared with radiologists’ reports, actual outcomes, and national benchmarks using commonly used mammography metrics. Up to 5 years of follow-up examination results were reviewed to establish benignity. The algorithm sorted cases into “suspicious” and “low suspicion” groups. A theoretical workload reduction was calculated by subtracting cases triaged as “low suspicion” from the sample.

Results: At the default 93% sensitivity setting, there was significant improvement (P < .05) in the following triage simulation mean performance measures compared with actual outcomes: 45.5% improvement in recall rate (13.4% to 7.3%; 95% CI, 6.2-8.3), 119% improvement in positive predictive value (PPV) 1 (5.3% to 11.6%; 95% CI, 9.96-13.4), 28.5% improvement in PPV2 (24.6% to 31.6%; 95% CI, 24.8-39.1), 20% improvement in sensitivity (83.3% to 100%; 95% CI, 100-100), and 7.2% improvement in specificity (87.2% to 93.5%; 95% CI, 92.4-94.5). A theoretical 62.5% workload reduction was possible. At the ultrahigh 99% sensitivity setting, a theoretical 27% workload reduction was possible. No cancers were missed by the algorithm at either sensitivity setting.

Conclusion: Artificial intelligence–based triage in this simulation demonstrated potential for significant improvement in mammography performance and predicted substantial theoretical workload reduction without any missed cancers.
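The improvement percentages quoted in the Results appear to be relative changes between the actual outcome and the triage simulation, not differences in percentage points. The sketch below recomputes them from the before/after values given in the abstract. The dictionary and function names are illustrative assumptions and are not taken from the study.

```python
# Before/after values (in percent) as quoted in the abstract:
# (actual outcome, triage simulation at the default 93% sensitivity setting).
REPORTED = {
    "recall rate": (13.4, 7.3),
    "PPV1": (5.3, 11.6),
    "PPV2": (24.6, 31.6),
    "sensitivity": (83.3, 100.0),
    "specificity": (87.2, 93.5),
}


def relative_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent of `before`."""
    return (after - before) / before * 100.0


# Hypothetical helper output: one relative change per metric.
changes = {name: relative_change(b, a) for name, (b, a) in REPORTED.items()}

for name, pct in changes.items():
    # A negative value for recall rate is a reduction, which is the improvement.
    print(f"{name}: {pct:+.1f}%")
```

Under this reading, the recomputed values agree with the reported 45.5%, 119%, 28.5%, 20%, and 7.2% after rounding. The recall rate comes out negative because a lower recall rate is the desired direction.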
Watanabe et al. (Sun,) studied this question.