July 2, 2026 | Open Access

Artificial Intelligence in Clinical Decision-Making: A Systematic Review of Diagnostic Accuracy, Predictive Performance, and Clinical Outcomes

Key points

This review assesses the diagnostic accuracy and clinical outcomes of AI systems versus standard practice.
A comprehensive literature search covered PubMed, Embase, Scopus, and the Cochrane Library.
The review followed PRISMA 2020 guidelines and included six studies with a combined sample of approximately 1.1 million patients and imaging datasets.
A narrative synthesis was performed because of heterogeneity, and risk of bias was assessed using QUADAS-2 and ROBINS-I.
AI models showed AUC values between 0.85 and 0.96, sensitivity up to 97%, and specificity up to 93%.
Performance was particularly strong in radiology and dermatology, where it was comparable or superior to that of clinicians.
ICU predictive models exhibited more variability, highlighting challenges in consistency.

Abstract

Artificial intelligence (AI) is increasingly used in clinical decision-making to improve diagnostic accuracy, predictive performance, and treatment planning across multiple specialties. This systematic review evaluated the diagnostic accuracy and clinical outcomes of AI-based systems compared with standard clinical practice. A comprehensive literature search was conducted in PubMed, Embase, Scopus, and the Cochrane Library following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) 2020 guidelines, and six studies with a combined sample size of approximately 1.1 million patients and imaging datasets were included. Because of substantial heterogeneity in study populations, AI models, clinical settings, and outcome measures, a narrative synthesis was performed. Risk of bias was assessed using the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) and Risk Of Bias In Non-randomised Studies of Interventions (ROBINS-I) tools. Overall, AI models demonstrated strong performance, with area under the receiver operating characteristic curve (AUC) values ranging from 0.85 to 0.96, sensitivity up to 97%, and specificity up to 93%. Performance was particularly strong in radiology and dermatology, where it was comparable or superior to that of clinicians. However, intensive care unit (ICU)-based predictive models showed more variable performance. In conclusion, AI demonstrates promising diagnostic and predictive accuracy, but the evidence is derived predominantly from retrospective studies, and prospective multicentre trials are needed before routine clinical implementation.
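
For readers less familiar with the three metrics reported above, the following is a minimal sketch of how sensitivity, specificity, and AUC are typically computed from a model's outputs. The labels, scores, and the 0.5 decision threshold are hypothetical toy values chosen for illustration; they are not data from the review or from any included study, and the function names are assumptions of this sketch.

```python
# Illustrative only: toy labels and scores, NOT data from the review.

def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = disease present)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical reference-standard labels and model scores for eight cases.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

# Assumed decision threshold of 0.5 turns scores into predicted labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

sens, spec = sensitivity_specificity(y_true, y_pred)
area = auc(y_true, scores)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} AUC={area:.4f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarises ranking quality across all thresholds, which is one reason the three figures are reported together.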
