May 17, 2026. Open Access

Evaluating retrieval-augmented generation for guideline-grounded textual planning in implant dentistry: A comparative study

Abstract

OBJECTIVES: To evaluate whether a retrieval-augmented generation (RAG) framework can enhance citation precision and improve the clinical reliability of large language models (LLMs) in implant dentistry.

METHODS: A domain-specific knowledge base was constructed from major international consensus reports and established clinical treatment guides. Forty standardized clinical vignettes (35 real-world, 5 synthetic) were each processed twice by both a standard LLM (GPT-4o) and a RAG-LLM, yielding 160 treatment plans. Intra-model agreement was quantified using weighted Cohen's kappa. Two independent experts evaluated the plans across five clinical domains using a double-blinded, consensus-driven protocol.

RESULTS: Overall clinical accuracy did not differ significantly between models (P = 0.642). However, domain-specific analysis revealed that the RAG-LLM significantly outperformed the standard model in evidence traceability and citation precision (P = 0.046). Neither model exhibited severe literature fabrication. Conversely, the standard LLM demonstrated superior biomechanical spatial planning (P = 0.025). The RAG architecture exhibited "retrieval bias" in soft-tissue complication scenarios, leading to inappropriate recommendations for hard-tissue augmentation. Sensitivity analysis revealed that the RAG model's superiority diminished in out-of-distribution synthetic cases lacking thematic overlap with the knowledge base.

CONCLUSIONS: Implementing RAG significantly improves evidence traceability, enhances citation precision, and demonstrates high intra-model consistency, with severe diagnostic shifts being exceedingly rare. However, text-based retrieval introduces diagnostic biases that cause overtreatment, and its superiority relies heavily on conceptual overlap with the knowledge base. RAG-LLMs are promising adjunctive tools but require continuous expert oversight to compensate for spatial reasoning limitations.

CLINICAL SIGNIFICANCE: This study demonstrates that retrieval-augmented generation (RAG) significantly enhances citation precision in implant dentistry by anchoring models to established clinical guidelines. While RAG enhances evidence traceability, clinicians must remain cautious of retrieval bias, which can lead to algorithmic overtreatment in soft-tissue complication management. Currently, RAG-enhanced AI serves as a valuable adjunctive tool but cannot replace the three-dimensional spatial reasoning of an experienced implantologist.
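
For readers unfamiliar with the agreement statistic named in the methods, the following is a minimal Python sketch of weighted Cohen's kappa applied to two repeated runs of the same model. It is an illustration only, not the authors' analysis code: the abstract does not state the weighting scheme, the rating scale, or the software used, so the linear weights, the 0-based ordinal scale, the function name weighted_kappa, and the example ratings are all assumptions.

```python
from collections import Counter

# Study design stated in the abstract: 40 vignettes x 2 models x 2 runs each.
N_VIGNETTES, N_MODELS, N_RUNS = 40, 2, 2
TOTAL_PLANS = N_VIGNETTES * N_MODELS * N_RUNS  # 160 treatment plans


def weighted_kappa(run_a, run_b, n_categories, scheme="linear"):
    """Weighted Cohen's kappa for two equal-length lists of ordinal
    ratings coded 0 .. n_categories - 1.

    scheme: "linear" or "quadratic" disagreement weights (assumption:
    the abstract does not say which scheme the study used).
    """
    if len(run_a) != len(run_b) or not run_a:
        raise ValueError("ratings must be non-empty and of equal length")
    if n_categories < 2:
        raise ValueError("need at least two categories")
    if scheme not in ("linear", "quadratic"):
        raise ValueError("scheme must be 'linear' or 'quadratic'")

    n = len(run_a)
    power = 1 if scheme == "linear" else 2
    max_dist = (n_categories - 1) ** power

    def weight(i, j):
        # Disagreement weight: 0 on the diagonal, 1 at maximum distance.
        return abs(i - j) ** power / max_dist

    observed = Counter(zip(run_a, run_b))  # joint counts per (a, b) pair
    marg_a = Counter(run_a)                # marginal counts, first run
    marg_b = Counter(run_b)                # marginal counts, second run

    # Observed weighted disagreement.
    obs_dis = sum(weight(i, j) * c for (i, j), c in observed.items()) / n
    # Weighted disagreement expected by chance from the marginals.
    exp_dis = sum(
        weight(i, j) * marg_a[i] * marg_b[j]
        for i in range(n_categories)
        for j in range(n_categories)
    ) / (n * n)

    if exp_dis == 0:
        # Both runs used one and the same category throughout.
        return 1.0
    return 1.0 - obs_dis / exp_dis


# Hypothetical ratings of four vignettes on a 3-level ordinal scale.
first_run = [0, 1, 2, 2]
second_run = [0, 1, 2, 1]
kappa = weighted_kappa(first_run, second_run, n_categories=3)
```

Weighting penalizes a disagreement in proportion to the distance between the two ratings, so a one-step shift between runs lowers kappa less than a shift across the whole scale. That property is what makes the statistic suited to separating minor run-to-run variation from the severe diagnostic shifts mentioned in the conclusions.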
