August 15, 2025 | Open Access

EchoLLM: extracting echocardiogram entities with light-weight, open-source large language models

Key Points

Gemma2:9b-instruct achieved the highest precision, recall, and F1 scores, indicating strong performance in extracting echocardiogram findings.
Gemma2:9b-instruct scored 0.973 for precision, 0.959 for recall, and 0.965 for F1, outperforming other models.
The analysis evaluated 14 open-source LLMs on entity extraction from 507 manually annotated echocardiogram reports.
Using open-source LLMs for echocardiogram entity extraction may enhance clinical research and healthcare delivery efficiency.

Abstract

Objectives: Large language models (LLMs) have demonstrated high levels of performance in clinical information extraction compared to rule-based systems and traditional machine-learning approaches, offering scalability, contextualization, and easier deployment. However, most studies rely on proprietary models with privacy concerns and high costs, limiting accessibility. We aim to evaluate 14 publicly available open-source LLMs for extracting clinically relevant findings from free-text echocardiogram reports and examine the feasibility of their implementation in information extraction workflows.

Materials and Methods: We used 14 open-source LLMs to extract clinically relevant entities from echocardiogram reports (n = 507). Each report was manually annotated by 2 independent health-care professionals and adjudicated by a third. Lexical variance and length of each echocardiogram report were collected. Precision, recall, and F1 scores were calculated for the 9 extracted entities via multiclass classification.

Results: In aggregate, Gemma2:9b-instruct had the highest precision, recall, and F1 scores at 0.973 (0.962-0.983), 0.959 (0.947-0.973), and 0.965 (0.951-0.975), respectively. In comparison, Phi3:3.8b-mini-instruct had the lowest precision score at 0.831 (0.804-0.856), while Gemma:7b-instruct had the lowest recall and F1 scores at 0.382 (0.356-0.408) and 0.392 (0.356-0.428), respectively.

Discussion and Conclusion: Using LLMs for entity extraction from echocardiogram reports has the potential to support both clinical research and health-care delivery. Our work demonstrates the feasibility of using open-source models for more efficient computation and extraction.
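
For readers unfamiliar with the metrics named in the abstract, the following is a minimal sketch of how precision, recall, and F1 can be computed for one entity treated as a multiclass classification problem. It is illustrative only: the abstract does not state how scores were averaged across classes or entities, so the macro average used here is an assumption, and the labels ("normal", "mild", "severe") and the function names `per_class_prf` and `macro_average` are hypothetical, not taken from the study.

```python
from collections import Counter


def per_class_prf(gold, pred):
    """Return {label: (precision, recall, f1)} for single-label multiclass data."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # the predicted label was wrong for this report
            fn[g] += 1  # the annotated (gold) label was missed
    scores = {}
    for label in sorted(set(gold) | set(pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores


def macro_average(scores):
    """Unweighted mean of precision, recall, and F1 over classes (an assumed choice)."""
    n = len(scores)
    return tuple(sum(s[i] for s in scores.values()) / n for i in range(3))


# Hypothetical annotations and model outputs for one entity; not data from the study.
gold = ["normal", "mild", "severe", "mild", "normal", "severe"]
pred = ["normal", "mild", "mild", "mild", "normal", "severe"]

scores = per_class_prf(gold, pred)
macro_p, macro_r, macro_f1 = macro_average(scores)
print(f"precision={macro_p:.3f} recall={macro_r:.3f} f1={macro_f1:.3f}")
```

In this toy example, one "severe" report is mislabeled as "mild", which lowers precision for "mild" and recall for "severe" while leaving "normal" unaffected. A weighted or micro average would give different aggregate values, which is why the averaging choice is worth checking in the full paper.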
