Artificial intelligence (AI) methods have been developed for automated assessment of ulcerative colitis (UC) endoscopic disease activity from still images and, increasingly, full-length videos. Multiple systems have achieved substantial agreement with expert readers and central reading paradigms for the Mayo Endoscopic Subscore (MES/MCES), the Ulcerative Colitis Endoscopic Index of Severity (UCEIS), and related remission definitions, and several approaches have linked endoscopic AI outputs with histology and with clinical outcomes such as relapse risk. Video-based pipelines have also been positioned for clinical trial central reading and have included quality gating, temporal aggregation, and, in some cases, real-time use. Continuous and extent-sensitive metrics have been proposed to overcome the limitations of “worst-segment” ordinal scoring and have been evaluated in trial contexts. Despite these advances, the literature leaves a translation gap between high-performing scoring models and the health-informatics requirements for routine clinical deployment: interoperable representation of AI outputs in electronic health records (EHRs), uncertainty-aware escalation and adjudication, auditability, workflow-centered interfaces, and post-deployment governance. This narrative review synthesizes the technical and clinical trajectory of UC endoscopic AI and proposes an informatics blueprint that operationalizes model outputs as standardized, provenance-rich “evidence bundles” suitable for integration into clinical decision support systems (CDSS) and trial workflows.
The blueprint includes (i) a reference architecture spanning video ingestion, scoring, and longitudinal monitoring; (ii) a minimal data object schema for endoscopic AI readouts (scores, continuous indices, extent maps, quality, uncertainty, explanations, and provenance); (iii) a human–AI interaction pattern for “second-opinion triggering” and adjudication; and (iv) an evaluation and governance framework spanning performance, calibration, usability, equity, and domain-shift monitoring. The proposed blueprint aims to support evidence-based adoption while preserving clinical accountability and regulatory feasibility.
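As a rough illustration of items (ii) and (iii), the sketch below shows one way an "evidence bundle" and a second-opinion trigger might be represented. It is not the schema proposed in the review: the class names, field names, segment labels, and the thresholds of 0.3 and 0.7 are all hypothetical, chosen only to mirror the listed components (scores, continuous indices, extent maps, quality, uncertainty, explanations, and provenance).

```python
from dataclasses import dataclass, asdict
from typing import Dict, List, Optional
import json


@dataclass
class Provenance:
    """Where a readout came from (all field names are hypothetical)."""
    model_name: str
    model_version: str
    video_id: str
    scored_at: str  # ISO 8601 timestamp


@dataclass
class EvidenceBundle:
    """Illustrative AI readout; not the schema defined in the review."""
    mes: int                           # Mayo Endoscopic Subscore, 0 to 3
    uceis: Optional[int]               # UCEIS total, if the model produces one
    continuous_index: Optional[float]  # continuous severity index, if any
    extent_map: Dict[str, int]         # colon segment -> per-segment MES
    quality: float                     # assumed: share of frames passing quality gating, 0 to 1
    uncertainty: float                 # assumed: 0 to 1, higher means less confident
    explanations: List[str]            # e.g. references to salient frames
    provenance: Provenance

    def __post_init__(self) -> None:
        if not 0 <= self.mes <= 3:
            raise ValueError("MES must be between 0 and 3")
        for name in ("quality", "uncertainty"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be between 0 and 1")


def needs_second_opinion(
    bundle: EvidenceBundle,
    max_uncertainty: float = 0.3,  # hypothetical threshold
    min_quality: float = 0.7,      # hypothetical threshold
) -> bool:
    """Flag a readout for human adjudication if it is uncertain or low quality."""
    return bundle.uncertainty > max_uncertainty or bundle.quality < min_quality


example = EvidenceBundle(
    mes=2,
    uceis=5,
    continuous_index=0.58,
    extent_map={"rectum": 2, "sigmoid": 2, "descending": 1},
    quality=0.91,
    uncertainty=0.12,
    explanations=["frame 1042: erosions", "frame 2310: friability"],
    provenance=Provenance(
        model_name="example-uc-scorer",  # placeholder, not a real system
        model_version="0.0.0",
        video_id="video-0001",
        scored_at="2024-01-01T00:00:00Z",
    ),
)

# Serialize for exchange with an EHR or CDSS (the transport is out of scope here).
payload = json.dumps(asdict(example), indent=2)
```

Keeping uncertainty and quality next to the score, rather than in a separate log, lets the escalation rule operate on a single self-describing object. A real deployment would map such an object onto an interoperability standard instead of ad hoc JSON.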
Taabzadeh et al. (Mon,) studied this question.