This study evaluated whether AI can substitute for the first reader in a double-reading workflow for lung-cancer detection on screening chest radiographs. A retrospective analysis was conducted in a screening cohort at Ishikawa Health Service Association that included 155,503 participants undergoing 320,329 examinations between January 2018 and September 2020. Prespecified exclusions were applied to the examinations initially identified as suspected lung cancer by the conventional double-reading system (n = 2,882), yielding 1,847 examinations for detection-performance analysis. AI-based lesion detection was retrospectively performed using three AI models, and the localization accuracy of the AI outputs was evaluated. Detection performance (AI vs. first readers) was compared using McNemar’s test with a non-inferiority margin of −0.05 (AI was deemed non-inferior if the lower bound of the 95% CI for the difference in detection rates exceeded −0.05) in two settings: (1) all lesions and (2) pulmonary nodule/mass only. The false-positive rate per examination was estimated using 5,784 normal examinations (5,689 participants) performed between January and June 2018 with ≥2-year negative follow-up. For all abnormalities, each AI model met the non-inferiority criterion relative to first readers and showed higher detection rates (AI, 62.5–77.3%; first readers, 59.3%). Similar findings were observed when the analysis was limited to nodule/mass only (AI, 64.5–76.5%; first readers, 59.2%). False-positive rates per examination were 0.081 (Software A), 0.065 (Software B), and 0.147 (Software C), versus 0.002 for first readers. In a retrospective screening cohort, three AI models achieved non-inferior and overall higher detection performance compared with first readers for suspected lung cancer on chest radiographs. Despite higher false-positive rates, AI could feasibly assume the first-reader role within a conventional double-reading workflow while maintaining diagnostic quality.
Prospective, multi-center studies are warranted to confirm effectiveness, quantify workflow impact, and assess downstream consequences of AI-assisted single reading.
Yoshida et al. studied this question.
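The paired non-inferiority comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not report the discordant-pair counts or the exact confidence-interval method, so the Wald interval for a paired difference in proportions, the function name `paired_noninferiority`, and the counts in the usage note are all assumptions made for illustration. They are not the authors' implementation or data.

```python
import math
from dataclasses import dataclass
from statistics import NormalDist


@dataclass
class PairedResult:
    diff: float          # detection rate of AI minus detection rate of first readers
    ci_lower: float      # lower bound of the two-sided CI for diff
    ci_upper: float      # upper bound of the two-sided CI for diff
    mcnemar_chi2: float  # McNemar statistic (no continuity correction)
    mcnemar_p: float     # two-sided p-value, chi-square with 1 degree of freedom
    noninferior: bool    # True if ci_lower exceeds the margin


def paired_noninferiority(ai_only: int, reader_only: int, n: int,
                          margin: float = -0.05,
                          level: float = 0.95) -> PairedResult:
    """Compare two readers on the same n examinations.

    ai_only:     lesions detected by AI but missed by the first reader
    reader_only: lesions detected by the first reader but missed by AI
    Concordant pairs do not affect the difference, only n does.
    Assumption: a Wald interval is used; the source does not name the method.
    """
    if n <= 0 or ai_only < 0 or reader_only < 0 or ai_only + reader_only > n:
        raise ValueError("counts must be non-negative and fit within n")

    diff = (ai_only - reader_only) / n
    # Wald standard error for a difference of paired proportions
    se = math.sqrt(ai_only + reader_only - (ai_only - reader_only) ** 2 / n) / n
    z = NormalDist().inv_cdf(0.5 + level / 2)
    lower, upper = diff - z * se, diff + z * se

    discordant = ai_only + reader_only
    if discordant == 0:
        chi2, p = 0.0, 1.0
    else:
        chi2 = (ai_only - reader_only) ** 2 / discordant
        # Survival function of chi-square with 1 df: erfc(sqrt(x / 2))
        p = math.erfc(math.sqrt(chi2 / 2))

    return PairedResult(diff, lower, upper, chi2, p, lower > margin)
```

For example, with hypothetical counts `paired_noninferiority(ai_only=120, reader_only=80, n=1000)`, the estimated difference is 0.04 and the lower confidence bound lies above −0.05, so the non-inferiority criterion would be met. Reversing the imbalance (`ai_only=10, reader_only=80`) gives a difference of −0.07 and the criterion would fail.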