What is the clinical evidence from this study?

Study design: Cross-Sectional. Population: Cardiovascular diseases (n=21). Intervention: ChatGPT (Large Language Model) vs. Standard clinical assessments by experienced cardiologists. Primary outcome: Mean total score for report generation, diagnostic precision, and recommendations.

March 31, 2025 (Open Access)

Automated generation of echocardiography reports using artificial intelligence: a novel approach to streamlining cardiovascular diagnostics

Key Result

ChatGPT generated fully acceptable echocardiography reports in 85.7% of cases, with a mean total score of 6.86 (SD = 1.12). Judged against expert cardiologists' assessments, 5.3% of 299 parameters were misinterpreted.

Study Design

Type

Cross-Sectional (n=21)

Structured PICO

Does an LLM (ChatGPT) accurately generate echocardiography reports and clinical recommendations compared to standard clinical assessments?

Population

n=21 echocardiographic cases (13 fictional, 8 clinical)

Intervention

Large language model (ChatGPT) for automated generation of echocardiography reports and clinical recommendations

Comparator

Standard clinical assessments conducted by experienced cardiologists

Outcome

Accuracy in report generation, diagnostic precision, and appropriateness of recommendations, evaluated using a dedicated scoring system (surrogate)

ChatGPT demonstrates high accuracy in generating echocardiography reports and clinical recommendations, suggesting potential utility in streamlining clinical workflows.

Abstract

Accurate interpretation of echocardiography measurements is essential for diagnosing cardiovascular diseases and guiding clinical management. The emergence of large language models (LLMs) like ChatGPT presents a novel opportunity to automate the generation of echocardiography reports and provide clinical recommendations. This study aimed to evaluate the ability of an LLM (ChatGPT) to 1) generate comprehensive echocardiography reports based solely on provided echocardiographic measurements, and when enriched with clinical information 2) formulate accurate diagnoses, along with appropriate recommendations for further tests, treatment, and follow-up. Echocardiographic data from n = 13 fictional cases (Group 1) and n = 8 clinical cases (Group 2) were input into the LLM. The model's outputs were compared against standard clinical assessments conducted by experienced cardiologists. Using a dedicated scoring system, the LLM's performance was evaluated and stratified based on its accuracy in report generation, diagnostic precision, and the appropriateness of its recommendations. Patterns, frequency and examples of misinterpretations by the LLM were analysed. Across all cases, the mean total score was 6.86 (SD = 1.12). Group 1 had a mean total score of 6.54 (SD = 1.13) and accuracy of 3.92 (SD = 0.86), while Group 2 scored 7.38 (SD = 0.92) and 4.38 (SD = 0.92), respectively. Recommendation scores were 2.62 (SD = 0.51) for Group 1 and 3.00 (SD = 0.00) for Group 2, with no significant differences (p = 0.096). Of the reports, 85.7% were fully acceptable, 14.3% were borderline acceptable, and none were unacceptable. Of 299 parameters, 5.3% were misinterpreted. The LLM demonstrated a high level of accuracy in generating detailed echocardiography reports, mostly correctly identifying normal and abnormal findings, and making accurate diagnoses across a range of cardiovascular conditions.
ChatGPT, as an LLM, shows significant potential in automating the interpretation of echocardiographic data, offering accurate diagnostic insights and clinical recommendations. These findings suggest that LLMs could serve as valuable tools in clinical practice, assisting and streamlining clinical workflow.
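The summary figures in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is not part of the study. It uses only the group sizes, group means, and percentages reported above, and the variable names are illustrative. It assumes the overall mean is the case-weighted average of the two group means and that the acceptability percentages refer to all 21 cases.

```python
# Consistency check on figures reported in the abstract (no new data).
n_group1, n_group2 = 13, 8               # fictional cases, clinical cases
mean_group1, mean_group2 = 6.54, 7.38    # reported mean total scores

n_total = n_group1 + n_group2

# Assumption: the overall mean is the case-weighted average of group means.
pooled_mean = (n_group1 * mean_group1 + n_group2 * mean_group2) / n_total

# Assumption: the reported percentages are fractions of all 21 cases.
fully_acceptable = round(0.857 * n_total)
borderline = round(0.143 * n_total)

print(f"pooled mean total score: {pooled_mean:.2f}")
print(f"fully acceptable reports: {fully_acceptable} of {n_total}")
print(f"borderline acceptable reports: {borderline} of {n_total}")
```

Under these assumptions the weighted mean reproduces the reported 6.86, and the percentages correspond to 18 fully acceptable and 3 borderline reports out of 21.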

Authors

Finn Syryca

Christian Gräßer

Deutsches Herzzentrum München

Teresa Trenkwalder

Structural Heart Disease

Journals

The International Journal of Cardiovascular Imaging

Institutions

Deutsches Herzzentrum München

Klinik und Poliklinik für Nuklearmedizin
