November 24, 2025 | Open Access

The evaluation of tooth whitening from a perspective of artificial intelligence: a comparative analytical study

Abstract

Background: Artificial intelligence (AI) chatbots are increasingly consulted for information on dental aesthetics. This study evaluated how well multiple large language models (LLMs) answer patient questions about tooth whitening.

Methods: A total of 109 patient-derived questions, categorized into five clinical domains, were submitted to four LLMs: ChatGPT-4o, Google Gemini, DeepSeek R1, and DentalGPT. Two calibrated specialists evaluated the responses for usefulness, quality (Global Quality Scale, GQS), reliability (CLEAR tool), and readability (Flesch-Kincaid Reading Ease, FRE; SMOG index).

Results: The models generated consistently high-quality information. Most responses (68%) were rated “very useful” (mean score: 1.24 ± 0.3). Quality (mean GQS: 3.9 ± 2.0) and reliability (mean CLEAR: 22.5 ± 2.4) were high, with no significant differences between models or domains (p > 0.05). Readability, however, was a major limitation: the mean FRE score was 36.3 (“difficult” level) and the mean SMOG index was 11.0, corresponding to a high school reading level.

Conclusions: Contemporary LLMs provide useful and reliable information on tooth whitening but deliver it at a reading level incompatible with average patient health literacy. To serve as effective patient education adjuncts, future AI development must prioritize simpler, more readable output alongside informational accuracy.
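For readers unfamiliar with the two readability metrics named in the abstract, the following is a minimal sketch of the standard published Flesch Reading Ease and SMOG formulas. It is not the authors' scoring pipeline, which the abstract does not describe. The function names and the choice to take pre-computed counts (words, sentences, syllables, polysyllabic words) are assumptions made for illustration; the band labels follow the conventional Flesch interpretation table.

```python
import math


def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch Reading Ease formula; higher scores mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def smog_index(polysyllables: int, sentences: int) -> float:
    """Standard SMOG grade formula.

    `polysyllables` is the number of words with three or more syllables.
    The count is normalized to a 30-sentence sample.
    """
    return 1.0430 * math.sqrt(polysyllables * (30 / sentences)) + 3.1291


def fre_band(score: float) -> str:
    """Map a Flesch Reading Ease score to its conventional difficulty label."""
    bands = [
        (90, "very easy"),
        (80, "easy"),
        (70, "fairly easy"),
        (60, "standard"),
        (50, "fairly difficult"),
        (30, "difficult"),
    ]
    for lower_bound, label in bands:
        if score >= lower_bound:
            return label
    return "very confusing"


# Hypothetical counts for a single chatbot answer (not data from the study).
score = flesch_reading_ease(words=100, sentences=5, syllables=150)
print(round(score, 1), fre_band(score))

# The mean score reported in the abstract falls in the "difficult" band.
print(fre_band(36.3))
```

Under this banding, any score from 30 up to (but not including) 50 is labeled "difficult", which is consistent with how the abstract characterizes the reported mean FRE of 36.3.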
