Motivation: To support precise breast cancer diagnosis, this study investigates whether large language models (LLMs) such as ChatGPT can optimize MRI reporting in line with BI-RADS guidelines. Goal(s): We aimed to convert clinical reports into structured data and evaluate the feasibility of clinical application. Approach: We applied OpenAI's text-davinci-003 model to reports from 237 patients. Results: Kappa values indicated variable agreement. The model's high sensitivity suggests that it captures positive features effectively. ROC analysis showed that GPT's diagnostic performance was comparable to that of physician-annotated tests, particularly for HR and HER2. Impact: This study highlights the promising role of AI in radiology, where it could enhance diagnostic accuracy and supplement radiological expertise, especially in multilingual and resource-limited settings.
Song et al. studied this question.