September 5, 2025 | Open Access

Evaluation of AI-based chatbot responses to orthodontic retention-related patient questions: a comparative analysis

Key Points

AI chatbots differ significantly in the orthodontic retention guidance they deliver, which affects patient communication.
ChatGPT-4 generated the most original content (lowest Similarity Index) and the most readable responses (best SMOG score).
MediSearch and OpenEvidence outperformed the other platforms in EQIP and Reliability scores, indicating higher accuracy and reliability.
Statistically significant differences (p<0.05) were found between platforms in the EQIP, SMOG, and Similarity Index metrics.

Abstract

Aims: This study aimed to evaluate and compare the quality, accuracy, and readability of orthodontic retention-related information provided by five AI-based chatbot platforms: ChatGPT-4, Copilot, Gemini, MediSearch, and OpenEvidence.

Methods: A set of 25 commonly asked patient-oriented questions on orthodontic retention was submitted to each chatbot. Responses were evaluated using five key metrics: EQIP (Ensuring Quality Information for Patients), Reliability Score, Global Quality Score (GQS), SMOG Readability Index, and Similarity Index. Mean ± standard deviation values were calculated. Kruskal-Wallis and Dunn’s post-hoc tests were used to assess statistical differences.

Results: MediSearch and OpenEvidence outperformed others in EQIP and Reliability scores. ChatGPT-4 generated the most original content with the lowest Similarity Index. SMOG readability was significantly better for ChatGPT-4, while MediSearch and OpenEvidence produced more technically complex language. Statistically significant differences (p<0.05) were found between platforms in EQIP, SMOG, and Similarity Index metrics.

Conclusion: Performance among AI chatbots varies significantly in delivering orthodontic retention guidance. While medical-specific tools offer superior accuracy and reliability, general-purpose models like ChatGPT-4 excel in readability and originality. The results highlight the importance of matching AI platform selection with patient communication goals.
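For readers unfamiliar with the SMOG Readability Index named in the Methods, the following is a minimal sketch of the standard SMOG grade formula (McLaughlin, 1969). The abstract does not state which tool or implementation the authors used, so the function name `smog_grade` and its inputs are illustrative assumptions, not the study's procedure. A lower grade corresponds to text that is easier to read.

```python
import math


def smog_grade(polysyllable_count: int, sentence_count: int) -> float:
    """Return the SMOG grade for a text sample.

    Hypothetical helper for illustration only; not taken from the study.
    polysyllable_count: number of words with three or more syllables.
    sentence_count: number of sentences in the sample (must be positive).
    """
    if sentence_count <= 0:
        raise ValueError("sentence_count must be positive")
    if polysyllable_count < 0:
        raise ValueError("polysyllable_count must be non-negative")
    # Standard SMOG formula: polysyllables are normalised to a 30-sentence sample.
    normalised = polysyllable_count * (30 / sentence_count)
    return 1.0430 * math.sqrt(normalised) + 3.1291


if __name__ == "__main__":
    # Made-up counts, purely to show the call; not data from the study.
    print(round(smog_grade(polysyllable_count=30, sentence_count=30), 2))
```

Counting the polysyllabic words themselves requires a syllable counter, which is left out here because the abstract gives no detail on how that step was performed.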
