Background: With the rapid development of artificial intelligence (AI) technologies, AI chatbots have been widely applied in healthcare to provide patients with immediate information. Many people feel embarrassed to discuss gynecomastia in person and turn to online resources for support.

Objective: This study aims to evaluate the performance of five popular AI chatbots (ChatGPT, DeepSeek, Gemini, Perplexity, and Copilot) in answering questions about gynecomastia, focusing on their reliability, quality, readability, and guideline consistency.

Methods: The top 25 gynecomastia-related queries searched globally from 2004 to 2025 were retrieved from Google Trends and submitted to the five AI chatbots. The reliability and quality of the responses were assessed using the DISCERN questionnaire and the Ensuring Quality Information for Patients (EQIP) tool. Readability was analyzed via the Flesch-Kincaid Grade Level (FKGL) and the Flesch-Kincaid Reading Ease Score (FKRE). The accuracy, supplementary content, and incompleteness of the responses were assessed against the European Association of Andrology guidelines.

Results: Copilot had the lowest DISCERN score (median (interquartile range, IQR): 41.5 (36.0-45.0)), while DeepSeek performed best in EQIP scoring (median (IQR): 60.4 (59.0-64.1)). For readability, ChatGPT exhibited the highest FKGL score (mean ± standard deviation (SD): 15.1 ± 2.0) but the lowest FKRE score (mean ± SD: 15.1 ± 2.0), indicating the poorest readability. In contrast, DeepSeek achieved the lowest FKGL (mean ± SD: 11.0 ± 1.2), suggesting superior readability. Guideline consistency analysis revealed an overall accuracy of 85.71% for AI responses, but key details were often omitted.

Conclusion: AI chatbots provide immediate informational support for gynecomastia patients, but their readability and reliability vary significantly, and they risk omitting guideline content.
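To make the two readability metrics concrete, the sketch below computes them from the standard published formulas (the abstract's FKRE is usually called Flesch Reading Ease). It is an illustration only, not the study's actual tooling: the function names `count_syllables` and `readability` are hypothetical, and the syllable counter is a rough vowel-group heuristic, so scores will differ somewhat from dedicated readability software.

```python
import re
from typing import Tuple


def count_syllables(word: str) -> int:
    """Approximate syllables by counting vowel groups (heuristic assumption)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing 'e' as silent ("make" -> 1), except "-le" endings ("table" -> 2).
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def readability(text: str) -> Tuple[float, float]:
    """Return (FKGL, Flesch Reading Ease) for a block of English text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")

    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = sum(count_syllables(w) for w in words) / len(words)

    # Standard Flesch-Kincaid Grade Level: higher means harder to read.
    fkgl = 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59
    # Standard Flesch Reading Ease: higher means easier to read.
    fre = 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
    return fkgl, fre


if __name__ == "__main__":
    fkgl, fre = readability("The cat sat on the mat.")
    print(f"FKGL={fkgl:.2f}, FRE={fre:.2f}")
```

The two scores move in opposite directions: longer sentences and more syllables per word raise FKGL and lower Reading Ease, which is why a high FKGL paired with a low Reading Ease score is read as poor readability.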
Xinran Shao
Ting Ruan
Xingai Ju
Digital Health
China Medical University
First Hospital of China Medical University
Liaoning University of Traditional Chinese Medicine
Shao et al. (Thu,) studied this question.
www.synapsesocial.com/papers/68af6203ad7bf08b1eae2ce2 — DOI: https://doi.org/10.1177/20552076251367645