What question did this study set out to answer?

This review aims to explore the applications of AI in orthopedic trauma surgery and identify research gaps.

March 31, 2026 · Open Access

Artificial intelligence in orthopedic trauma surgery: a scoping review of current applications and research gaps

Key Points

This review aims to explore the applications of AI in orthopedic trauma surgery and identify research gaps.
Conducted a PRISMA-SCR compliant scoping review
Systematic search across Semantic Scholar, OpenAlex, and PubMed
Included studies on AI applications in orthopedic trauma with at least 10 subjects
Synthesized data on AI methodology, clinical applications, validation strategies, and performance metrics.
146 studies included in the review
Deep learning was predominant in fracture detection (61% of studies)
85% of studies reported internal validation, but only 15% included external validation
High technical accuracy found in diagnostic models (AUC 0.90–1.00)
Explainability was underreported: only 24% of studies used methods such as saliency mapping, Grad-CAM or feature importance analysis.

Abstract

Artificial intelligence (AI) is rapidly transforming clinical decision-making, yet its role in orthopedic trauma surgery remains fragmented and unevenly validated in clinical practice. We conducted a PRISMA-SCR–compliant scoping review using a systematic search of the Semantic Scholar, OpenAlex and PubMed corpus via Elicit (497 records). Studies were eligible if they applied AI or machine-learning methods to traumatic orthopedic conditions, included ≥ 10 human subjects, reported quantitative performance metrics, and represented original research. After title/abstract and full-text screening, 146 studies were included. Data on study characteristics, AI methodology, clinical application, validation strategy, performance metrics, explainability and translational maturity were synthesized descriptively. Research output increased sharply after 2017, with 52% of all studies published since 2022. Most studies were retrospective (≈ 99%). Deep learning dominated the field (61%), particularly for fracture detection and classification, while classical machine-learning models were mainly used for outcome prediction. Internal validation was reported in 85% of studies, whereas only 15% clearly performed external or multicenter validation; true prospective clinical testing was rare (1.4%), and only a small subset of models had been implemented in practice (3.4%). Diagnostic models frequently achieved very high technical accuracy (AUC 0.90–1.00 in constrained tasks), while prognostic models showed moderate-to-high performance (AUC 0.75–0.95). Explainability was underreported: only 24% of studies used any form of saliency mapping, Grad-CAM or feature importance analysis. AI in orthopedic trauma surgery demonstrates strong technical feasibility but remains overwhelmingly at the proof-of-concept stage.
The field is characterized by limited external validation, minimal prospective evidence, scarce explainability, and insufficient workflow integration, factors that collectively hinder clinical translation. To bridge the gap from laboratory performance to real-world impact, future research must emphasize multicenter datasets, rigorous external and prospective validation, explainable AI, and user-centered implementation studies. AI has the potential to augment, rather than replace, orthopedic trauma care, but its safe and effective adoption requires substantial methodological maturation.

Cite This Study

Wurm et al. studied this question.

synapsesocial.com/papers/69cb645fe6a8c024954b8a98 https://doi.org/10.1007/s00402-026-06276-6
