What question did this study set out to answer?

The review set out to evaluate the scope, diagnostic performance, and methodological quality of YOLO-based detection and segmentation in dental imaging.

March 2, 2026 | Open Access

Exploring YOLO applications in dentistry through a systematic review of detection and segmentation models

Key Points

The review evaluates YOLO-based detection and segmentation in dental imaging, focusing on scope, diagnostic performance, and methodological quality.
The authors ran a PRISMA-compliant literature search of PubMed, Scopus, and Google Scholar covering 2020–2025.
Eligible studies applied YOLO-based detection or segmentation to dental images.
Risk of bias was assessed with QUADAS-2.
Seventy-three studies were included, spanning caries detection, periodontal assessment, lesion recognition, implants, and pediatric dentistry.
Reported F1-scores ranged from 0.63 to 0.994, and mAP50 from 0.425 to 1.0.
Comparability was limited by inconsistent metric reporting, and many models were constrained by small, single-center datasets and limited external validation.

Abstract

Objective: Oral diseases remain a major global health burden. Recent advances in artificial intelligence (AI) across medical imaging have encouraged similar developments in dental diagnostics. Within the spectrum of deep learning architectures, the You Only Look Once (YOLO) model has gained attention for its real-time object detection capabilities. This systematic review aims to comprehensively evaluate the scope, diagnostic performance, and methodological quality of YOLO applications in dental imaging.

Methods: A PRISMA-compliant search of PubMed, Scopus, and Google Scholar (2020–2025) identified studies applying YOLO-based detection or segmentation to dental images. Data extraction covered study characteristics, YOLO versions, datasets, annotation strategies, and performance metrics. Due to the high heterogeneity, a narrative synthesis was conducted. The risk of bias was assessed using the QUADAS-2.

Results: Seventy-three studies were included across diverse domains, including caries detection, periodontal assessment, lesion recognition, implants, and pediatric dentistry. Reported performance was generally high: F1-scores ranged from 0.63 to 0.994, and mAP50 from 0.425 to 1.0. Metrics reporting was inconsistent, as many studies provided only mAP50 rather than the more comprehensive mAP50-95 (range: 0.272–0.932), which limited comparability. Newer models (YOLOv8–YOLOv11) demonstrated improved sensitivity and multi-class detection, yet were often constrained by small, single-center datasets, reliance on augmentation, and limited external validation.

Conclusion: YOLO architectures offer strong potential as accurate and efficient diagnostic tools across dental specialties. Nonetheless, their clinical translation is hindered by dataset limitations, inconsistent reporting, and computational demands. Future research should prioritize the use of diverse datasets, standardized evaluation, and multicenter validation.
Ultimately, dataset quality and clinical context matter more for performance than the YOLO version.
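
The gap between mAP50 and mAP50-95 noted in the results is easier to see with a small worked example. The sketch below is illustrative only and is not taken from the review or from any included study. It assumes the common COCO-style convention in which mAP50 counts a detection as correct at an intersection-over-union (IoU) of at least 0.50, while mAP50-95 averages over ten IoU thresholds from 0.50 to 0.95 in steps of 0.05. The function names and the example boxes are hypothetical.

```python
# Illustrative sketch: why a detection that "counts" for mAP50 may
# contribute little to mAP50-95. Boxes are (x1, y1, x2, y2) in pixels.
# Assumption: COCO-style IoU thresholds; individual studies may differ.

def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def f1_score(tp, fp, fn):
    """F1 as the harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Ten thresholds: 0.50, 0.55, ..., 0.95 (assumed COCO-style convention).
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(iou_value):
    """Thresholds at which a detection with this IoU is a true positive."""
    return [t for t in IOU_THRESHOLDS if iou_value >= t]


# Hypothetical ground-truth lesion box and a slightly shifted prediction.
ground_truth = (0, 0, 10, 10)
prediction = (2, 0, 12, 10)

overlap = iou(ground_truth, prediction)
passed = thresholds_passed(overlap)
print(f"IoU = {overlap:.3f}")
print(f"True positive at {len(passed)} of {len(IOU_THRESHOLDS)} thresholds")
```

In this example the prediction is a true positive under the mAP50 criterion but fails the stricter thresholds that mAP50-95 also averages over, so a model whose boxes are roughly right can report a high mAP50 alongside a much lower mAP50-95. This is one reason results reported with only one of the two metrics are hard to compare.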

Cite This Study

Hartman et al. studied this question.

synapsesocial.com/papers/69a52df3f1e85e5c73bf139b https://doi.org/10.1007/s44163-026-00930-z
