Journal of Urology
Urodynamics/Lower Urinary Tract Dysfunction/Female Pelvic Medicine: Male Incontinence (MP03)
1 May 2024

MP03-15 HOW RELIABLE IS CHATGPT? BENCHMARKING AGAINST THE AUA/SUFU GUIDELINE ON POSTPROSTATECTOMY URINARY INCONTINENCE

Matheus F. de Azevedo, Vicktor B.P. Pinto, Marcelo L. Wroclawski, Guilherme Gentile, Vinicius L.M. de Jesus, Isabela C. Barros, Jose de Bessa, Homero Bruschini, William C. Nahas, Carlos A.R. Sacomani, Jaspreet S. Sandhu, and Cristiano M. Gomes

https://doi.org/10.1097/01.JU.0001009488.55564.85.15

Abstract

INTRODUCTION AND OBJECTIVE: The novel Artificial Intelligence (AI) ChatGPT has been revolutionizing the way research is conducted and is increasingly being used as a source of information by healthcare professionals. We aimed to evaluate the accuracy of the information generated by ChatGPT 3.5 (free) and ChatGPT 4 (subscription-based) regarding the assessment and treatment of postprostatectomy urinary incontinence (PPUI).

METHODS: A total of 20 questions were prepared by urologists with expertise in PPUI. The questions had uncontroversial answers based on the Incontinence after Prostate Treatment: AUA/SUFU Guideline. Ten were conceptual questions and ten were based on clinical cases, to evaluate ChatGPT's ability to apply knowledge and critical thinking skills.
All questions were submitted in English, anonymously (without IP identification) and separately, to versions 3.5 and 4 of ChatGPT. The engine was prompted to be specific and to limit its answers to 200 words for greater objectivity, and was not prompted to incorporate any specific guideline. Each question was entered as a separate, independent prompt using the "New Chat" function. AI-generated answers were independently analyzed by the experts who provided the questions. The accuracy of each response was graded as (A) correct (1 point); (B) partially correct (0.5 points); or (C) incorrect (0 points).

RESULTS: ChatGPT 3.5 had an accuracy of 65% on conceptual questions (5 correct answers, 3 partially correct and 2 incorrect) and 50% on clinical case questions (5 correct answers and 5 incorrect). ChatGPT 4 had an accuracy of 90% on both conceptual questions (8 correct answers and 2 partially correct) and clinical scenario questions (9 correct answers and 1 incorrect). Table 1 shows examples of performance differences between the two versions of the AI.

CONCLUSIONS: ChatGPT has great potential to generate information in the healthcare field. However, critical assessment of its responses is essential given the potential error rate, particularly in the freely available 3.5 version. Version 4 demonstrated superior accuracy, performing well even with clinical case-based questions. Future studies should explore the role of these evolving technologies in enhancing education and healthcare practices.

Source of Funding: None

© 2024 by American Urological Association Education and Research, Inc.
Volume 211, Issue 5S, May 2024, Page: e28
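The reported accuracy percentages follow directly from the grading scheme in the abstract (1 point for correct, 0.5 for partially correct, 0 for incorrect, over ten questions per category). The following is a minimal sketch of that arithmetic, not the authors' own analysis code; the function name accuracy_percent and the layout of the reported dictionary are assumptions made for illustration, while the answer counts are taken from the RESULTS text.

```python
from fractions import Fraction

# Points per grade, as stated in the abstract's grading scheme.
POINTS = {
    "correct": Fraction(1),
    "partial": Fraction(1, 2),
    "incorrect": Fraction(0),
}


def accuracy_percent(correct, partial, incorrect):
    """Return the accuracy (%) for one block of graded answers.

    Hypothetical helper: total points earned divided by the number
    of questions, expressed as a percentage.
    """
    total_questions = correct + partial + incorrect
    score = (
        correct * POINTS["correct"]
        + partial * POINTS["partial"]
        + incorrect * POINTS["incorrect"]
    )
    return float(100 * score / total_questions)


# Answer counts (correct, partially correct, incorrect) from the RESULTS text.
reported = {
    ("ChatGPT 3.5", "conceptual"): (5, 3, 2),
    ("ChatGPT 3.5", "clinical cases"): (5, 0, 5),
    ("ChatGPT 4", "conceptual"): (8, 2, 0),
    ("ChatGPT 4", "clinical cases"): (9, 0, 1),
}

for (version, category), counts in reported.items():
    print(f"{version}, {category}: {accuracy_percent(*counts):.0f}%")
```

Run as written, this reproduces the figures in the abstract: 65% and 50% for ChatGPT 3.5, and 90% in both categories for ChatGPT 4.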