Objective: To compare the accuracy and consistency of five large language models (LLMs) in generating responses about dental trauma. Materials and methods: = 0.05), alongside calculation of sensitivity, specificity, accuracy, and area under the ROC curve (AUC) based on the 60-item set. Temporal stability was assessed using the intraclass correlation coefficient (ICC). Results: 0.90). Conclusion: All evaluated LLMs, particularly Copilot and DeepSeek, demonstrated high accuracy in providing information on dental trauma, with stable performance over time. The use of a context prompt did not significantly affect accuracy or stability.
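As an illustration of the metrics named in the abstract, the following is a minimal sketch of how sensitivity, specificity, and accuracy can be computed from binary item labels. It is not the authors' analysis code: the function name `binary_metrics`, the 0/1 coding, and the eight-item example data are all hypothetical, and the AUC shown is only the single-operating-point value, which for a binary classifier equals the mean of sensitivity and specificity.

```python
def binary_metrics(y_true, y_pred):
    """Compute diagnostic metrics from 0/1 reference labels and 0/1 model judgments.

    Hypothetical helper for illustration; the coding (1 = statement is true,
    0 = statement is false) is an assumption, not taken from the paper.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative reference item")

    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / len(pairs)
    # With a single binary decision per item, the ROC curve has one operating
    # point, and the trapezoidal AUC reduces to the mean of the two rates.
    auc_single_point = (sensitivity + specificity) / 2

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "auc_single_point": auc_single_point,
    }


# Invented example data (not from the study): 8 items, 4 true and 4 false.
reference = [1, 1, 1, 0, 0, 0, 1, 0]
model_answers = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = binary_metrics(reference, model_answers)
```

The same function could be applied per model to a 60-item answer key; comparing repeated runs over time would additionally require an ICC computation, which is not sketched here.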
Rafaela Mancini Lisboa
Arian Braido
Adriana de Jesus Soares
Frontiers in Oral Health
Universidade Estadual de Campinas (UNICAMP)
All India Institute of Medical Sciences
Universidade Federal de Uberlândia
Lisboa et al. studied this question.
synapsesocial.com/papers/6a0889d0df3db87398109ea3 — DOI: https://doi.org/10.3389/froh.2025.1737114