What does this research mean for the field?

Although large language models demonstrate expert-level theoretical knowledge in implant dentistry, persistent hallucinations and a lack of visual processing capabilities dictate that they must currently be used only as adjunctive support systems under strict professional oversight.

Novelty: synthesis. Consensus alignment: neutral.

What question did this study set out to answer?

This review aims to assess the current applications, performance, and limitations of generative AI in implant dentistry.

June 1, 2026 | Open Access

Large language models in implant dentistry: a scoping review of applications, performance, and limitations

Key Points

This review aims to assess the current applications, performance, and limitations of generative AI in implant dentistry.
The authors conducted a scoping review following PRISMA-ScR guidelines, searching four databases (PubMed, Web of Science, Scopus, and Embase).
Eighteen eligible studies were analyzed for model architecture, task performance, and proficiency relative to human experts.
Applications fell into three categories: patient interaction (n = 9), medical knowledge (n = 7), and diagnostic assessment (n = 2).
Advanced reasoning models (e.g., ChatGPT-o1, DeepSeek-R1) showed improved performance, occasionally surpassing licensed dentists in certification examinations.
Comparisons between medical-specific and general-purpose models yielded divergent results; domain specialization did not guarantee higher accuracy.
Hallucinations persist despite retrieval-augmented generation, and text-based models cannot interpret diagnostic imaging.

Abstract

BACKGROUND: This scoping review evaluates the current state of generative artificial intelligence (AI) in implant dentistry, focusing on the performance, clinical applications, and inherent limitations of large language models (LLMs) in both clinical and educational settings.

METHODS: A scoping review was conducted in accordance with PRISMA-ScR guidelines. A comprehensive search across four electronic databases (PubMed, Web of Science, Scopus, and Embase) was performed for literature published through December 2025. Eighteen eligible studies were analyzed to assess model architectures, specific task performance, and comparative proficiency against human experts.

RESULTS: The included studies, all published in 2024 or 2025, were classified as examining patient interaction (n = 9), medical knowledge (n = 7), or diagnostic assessment (n = 2). The analysis revealed a distinct evolution in performance with the advent of advanced reasoning models (e.g., ChatGPT-o1, DeepSeek-R1), which occasionally surpassed licensed dentists in certification examinations. Comparative assessments between medical-specific models and general-purpose LLMs yielded divergent outcomes, indicating that domain specialization does not inherently guarantee superior clinical accuracy against state-of-the-art generalist architectures. Nevertheless, reliability remains a concern: despite the integration of retrieval-augmented generation, hallucinations persist, especially in systematic search tasks, and the inability of text-based models to interpret diagnostic imaging continues to limit their clinical autonomy.

CONCLUSIONS: Although generative AI has attained expert-level proficiency in theoretical knowledge retrieval, it currently serves best as an adjunctive support system rather than a replacement for clinical judgment. Given the persistent risks of hallucination and the lack of visual processing capabilities, strict professional oversight is mandatory. Future research must prioritize the development of multimodal models and validate clinical outcomes through randomized trials.
