• Artificial intelligence (AI) models demonstrate consistently high diagnostic performance for intracranial aneurysm rupture status or rupture-related risk assessment across multiple metrics.
• Diagnostic performance of human readers varies substantially according to expertise, with expert readers outperforming non-experts in sensitivity but showing similar overall accuracy.
• Human–AI collaborative approaches show promising diagnostic performance in individual studies; however, current evidence is limited and highly dependent on integration strategy.
• Considerable methodological heterogeneity, inconsistent outcome definitions, and limited head-to-head comparisons restrict definitive conclusions regarding comparative superiority.
• Standardized definitions, transparent reporting, and multicenter prospective studies are required to clarify the optimal role of AI and Human–AI collaboration in aneurysm care.

Accurate assessment of intracranial aneurysm rupture risk remains challenging, with conventional clinical scores showing limited predictive performance. Artificial intelligence (AI) has emerged as a potential adjunct to imaging-based risk stratification, yet the relative performance of AI, human experts, and Human–AI collaboration has not been systematically quantified.

A systematic review and meta-analysis were conducted in accordance with PRISMA guidelines. PubMed, Scopus, Web of Science, and Embase were searched. Studies comparing AI-alone, human-alone, or Human–AI approaches for intracranial aneurysm rupture status or rupture-related risk assessment were included. The outcomes of interest were sensitivity, specificity, accuracy, and the area under the receiver operating characteristic curve (AUC). Random-effects models were applied when appropriate, and methodological quality was assessed using the ROBINS-I tool.

Seventeen studies were included. AI-alone models demonstrated consistently high diagnostic performance across sensitivity, accuracy, and AUC, although substantial heterogeneity was observed across methodologies and imaging modalities. Human reader sensitivity varied according to expertise, with expert readers achieving higher sensitivity than non-experts but similar overall accuracy. Human–AI collaborative approaches showed high diagnostic metrics in individual studies; however, the number of eligible studies was limited, and integration strategies were heterogeneous, precluding definitive pooled comparisons.

Current evidence indicates that AI models can achieve high diagnostic performance in intracranial aneurysm rupture risk assessment and may serve as a valuable adjunct to clinical expertise. While Human–AI collaboration shows promise, available data remain exploratory and strategy-dependent. Standardized outcome definitions and multicenter prospective studies are required before firm conclusions regarding comparative effectiveness or clinical superiority can be established.
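
For readers unfamiliar with random-effects pooling, the following is a minimal sketch of how study-level sensitivities could be combined. It assumes a DerSimonian-Laird estimator on logit-transformed proportions with a 0.5 continuity correction; the abstract does not state which estimator or transformation was used, so this choice is an assumption. The function names and the study counts are hypothetical and are not taken from the review.

```python
import math


def logit_and_variance(events, total):
    """Logit of a proportion and its approximate variance.

    A 0.5 continuity correction is always applied (an assumed convention).
    """
    a = events + 0.5
    b = total - events + 0.5
    return math.log(a / b), 1.0 / a + 1.0 / b


def dersimonian_laird(effects, variances):
    """Pool effects with the DerSimonian-Laird random-effects estimator.

    Returns (pooled effect, standard error, tau^2, I^2 in percent).
    """
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q measures dispersion around the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, se, tau2, i2


def inv_logit(x):
    """Map a logit back to the proportion scale."""
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical (true positives, ruptured cases) per study; illustrative only.
studies = [(45, 50), (80, 100), (27, 30)]
pairs = [logit_and_variance(tp, n) for tp, n in studies]
effects = [y for y, _ in pairs]
variances = [v for _, v in pairs]

pooled, se, tau2, i2 = dersimonian_laird(effects, variances)
pooled_sensitivity = inv_logit(pooled)
ci_low = inv_logit(pooled - 1.96 * se)
ci_high = inv_logit(pooled + 1.96 * se)
print(f"pooled sensitivity: {pooled_sensitivity:.3f} "
      f"(95% CI {ci_low:.3f} to {ci_high:.3f}), I^2 = {i2:.1f}%")
```

The same routine applies to specificity by passing true negatives and non-ruptured case counts. Diagnostic accuracy meta-analyses often prefer bivariate models that pool sensitivity and specificity jointly; the univariate approach here is only the simplest illustration.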
Naghizadeh et al. studied this question.