Objective: This study assesses ChatGPT-4o's responses to common patient inquiries regarding urinary incontinence (UI), a condition that significantly impacts quality of life but often goes untreated because few patients seek care. The evaluation focuses on four key metrics: understandability, actionability, reliability, and readability.

Material and Methods: In this qualitative study without human subjects, 13 patient-focused questions, derived from AUA/SUFU and EAU guidelines, were posed to ChatGPT-4o in Turkish. The questions were categorized into four themes: Definition, Diagnosis, Management, and Surgical Considerations. Three blinded experts (a urogynecologist, a urologist, and a pelvic floor physiotherapist) independently evaluated the responses using the Patient Education Materials Assessment Tool (PEMAT) for understandability and actionability and the modified DISCERN (mDISCERN) tool for reliability. Readability was measured using the Çetinkaya–Uzun formula, which was designed specifically for Turkish text. Statistical analysis included descriptive statistics and the Intraclass Correlation Coefficient (ICC) to determine inter-rater reliability.

Results: The experts showed strong agreement in their assessments of ChatGPT-4o's performance in urinary incontinence education, with inter-rater reliability scores of 0.80 (95% CI: 0.70-0.91) for PEMAT and 0.82 (95% CI: 0.70-0.91) for mDISCERN. The AI's responses were consistently highly understandable, particularly when explaining diagnoses (peak score of 94.4%), yet they were considerably less actionable, meaning they often failed to provide clear, practical steps for patients to follow. This gap was most evident in surgical considerations, which were deemed the least actionable at 68.2%. The overall reliability of the content was rated as “fair” across all categories, with surgical information being the most reliable. In terms of readability, most responses were classified as “difficult,” requiring a university-level education to comprehend, with surgery-related topics being the most linguistically complex.

Conclusion: While ChatGPT-4o yields comprehensible health information, its limited actionability and high linguistic complexity pose barriers to patients with lower health literacy.
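For readers unfamiliar with how PEMAT percentages such as those reported above are usually derived, the following is a minimal sketch, not the authors' analysis code. It assumes the standard PEMAT convention in which each item is rated agree (1), disagree (0), or not applicable, and the score is the share of applicable items rated agree. The function name `pemat_score` and the example ratings are hypothetical and are not data from the study.

```python
from typing import Optional, Sequence


def pemat_score(ratings: Sequence[Optional[int]]) -> float:
    """Return a PEMAT-style percentage score.

    Each rating is 1 (agree), 0 (disagree), or None (not applicable).
    Not-applicable items are excluded from the denominator.
    """
    applicable = [r for r in ratings if r is not None]
    if not applicable:
        raise ValueError("at least one applicable item is required")
    if any(r not in (0, 1) for r in applicable):
        raise ValueError("ratings must be 0, 1, or None")
    return 100.0 * sum(applicable) / len(applicable)


# Hypothetical ratings from one rater for one response (illustration only).
example_ratings = [1, 1, 0, None, 1, 1, 1, 0, 1, None, 1, 1]
example_score = pemat_score(example_ratings)  # 8 of 10 applicable items agree
print(f"{example_score:.1f}%")
```

Under this convention, a response with 8 of 10 applicable items rated agree scores 80.0%. How the three raters' scores were combined per theme is not stated in the abstract, so that step is left out here.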
Ayşe Filiz Gökmen Karasu
Bezmiâlem Vakıf Üniversitesi
Betul Cinar
Bezmiâlem Vakıf Üniversitesi
Melda Kuyucu
Izmir University
Digital Health
Bingöl University
Bezmiâlem Vakıf Üniversitesi
State Hospital
Karasu et al. studied this question.
synapsesocial.com/papers/6a2901886f82f25be989dda7 — DOI: https://doi.org/10.1177/20552076261459527