What question did this study set out to answer?

This study asks whether transformer-based language models can convert unstructured cardiac reports into structured data, improving data accessibility for research and clinical care.

March 28, 2026 | Open Access

Converting unstructured cardiac catheterization and echocardiography reports into structured data using transformer-based language models

What does this research mean for the field?

Locally deployed, fine-tuned transformer-based language models can accurately extract structured clinical data from unstructured echocardiography and cardiac catheterization reports, achieving over 90% accuracy, precision, and recall. Novelty: methodological. Consensus alignment: neutral.

Key Result

Locally run transformer-based language models extracted structured data from unstructured echocardiography and cardiac catheterization reports with mean accuracies of 95.7% and 94.9%, respectively.

Key Points

BioclinicalBERT and BART-Large-CNN were fine-tuned in a secure local environment.
The dataset included 3286 echocardiography and 1884 cardiac catheterization reports, annotated for 25 and 47 categories, respectively.
A question-answering approach was used for model training and evaluation.
Both models showed high performance with accuracy, precision, and recall above 90%.
BioclinicalBERT achieved a mean accuracy of 95.7% for echocardiography reports.
BART-Large-CNN slightly outperformed BioclinicalBERT on cardiac catheterization reports, with a mean accuracy of 94.9% vs 94.3%.
Performance improved with increased training data, plateauing around 1000 reports.

Structured PICO

Do fine-tuned transformer-based language models accurately extract structured data from unstructured echocardiography and cardiac catheterization reports?

Population

3286 echocardiography and 1884 cardiac catheterization reports from Kaiser Permanente Southern California's electronic health records

Intervention

Fine-tuned transformer-based language models (BioclinicalBERT and BART-Large-CNN) run locally using a question-answering approach
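
To make the question-answering framing concrete, the sketch below shows one way a single report could be turned into one question per annotated category. It is a minimal illustration, not the study's pipeline: the category keys, the question wording, the sample report text, and the `QAExample` and `build_examples` names are all hypothetical, and the source does not describe how questions were phrased or how absent categories were encoded.

```python
from dataclasses import dataclass


@dataclass
class QAExample:
    question: str  # one question per predefined category
    context: str   # the full unstructured report text
    answer: str    # annotated answer; empty string if the report has none


# Hypothetical category-to-question mapping (the study used 25 categories for
# echocardiography and 47 for cardiac catheterization; these two are made up).
QUESTIONS = {
    "lvef": "What is the left ventricular ejection fraction?",
    "mitral_regurgitation": "What is the severity of mitral regurgitation?",
}


def build_examples(report_text: str, annotations: dict) -> list:
    """Pair every category question with the report and its annotation."""
    return [
        QAExample(question=q, context=report_text, answer=annotations.get(cat, ""))
        for cat, q in QUESTIONS.items()
    ]


# Usage with an invented report and a partial annotation.
report = "LVEF 55%. Aortic valve is trileaflet."
examples = build_examples(report, {"lvef": "55%"})
```

Framing extraction this way lets one fine-tuned model serve every category, because the category is carried by the question rather than by a separate classifier head.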

Outcome

Model performance assessed using accuracy, precision, recall, and F1-score at 2 probability thresholds
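
The sketch below illustrates how accuracy, precision, recall, and F1-score can depend on a probability threshold. It assumes a convention the source does not spell out: an answer whose probability falls below the threshold is treated as "no answer", a wrong extracted answer counts as a false positive, and a missed annotated answer counts as a false negative. The `Prediction` and `score` names are hypothetical, and the threshold of 0.5 is only a stand-in for the second, unnamed threshold.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Prediction:
    gold: Optional[str]    # annotated answer, None if the category is absent
    answer: Optional[str]  # answer extracted by the model, None if empty
    prob: float            # model probability attached to that answer


def score(preds, threshold):
    """Compute metrics after discarding answers below the threshold."""
    tp = fp = fn = tn = 0
    for p in preds:
        # Assumed rule: low-probability answers are treated as "no answer".
        answer = p.answer if p.prob >= threshold else None
        if answer is None and p.gold is None:
            tn += 1
        elif answer is None:
            fn += 1
        elif answer == p.gold:
            tp += 1
        else:
            fp += 1
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Invented predictions for four (report, category) pairs.
preds = [
    Prediction(gold="55%", answer="55%", prob=0.9),
    Prediction(gold=None, answer="mild", prob=0.05),
    Prediction(gold="severe", answer="severe", prob=0.3),
    Prediction(gold=None, answer=None, prob=0.0),
]
at_low = score(preds, 0.1)
at_high = score(preds, 0.5)
```

In this toy data, raising the threshold discards the correct but low-confidence "severe" answer, which lowers recall while leaving precision unchanged. That trade-off is the usual reason to report metrics at more than one threshold.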

Locally run, fine-tuned transformer-based language models can accurately extract structured data from unstructured echocardiography and cardiac catheterization reports, offering a privacy-preserving alternative to external APIs.

Abstract

Objectives

Echocardiography and cardiac catheterization reports capture important clinical assessments of cardiac function and disease severity. This study explores using open-source transformer-based language models (LMs), run locally within an institutional environment as a privacy-preserving alternative to external API-based large LMs, to systematically extract clinical data from unstructured echocardiography and cardiac catheterization reports, aiming to improve data accessibility for research and patient care.

Materials and Methods

Two transformer-based LMs, BioclinicalBERT and BART-Large-CNN, were fine-tuned in a secure local environment using a question-answering approach. The dataset included 3286 echocardiography and 1884 cardiac catheterization reports from Kaiser Permanente Southern California's electronic health records, annotated for 25 and 47 predefined categories, respectively. Three hundred reports of each type were randomly selected for validation, with the remainder used for training. Model performance was assessed using accuracy, precision, recall, and F1-score at 2 probability thresholds. The effect of training set size on model performance was also evaluated.

Results

Both models achieved consistently high accuracy, precision, and recall (all >90%) across the 5 seed runs for both report types. For echocardiography, BioclinicalBERT reached a mean accuracy of 95.7%, precision of 97.6%, recall of 97.4%, and F1-score of 0.98 at the probability threshold of 0.1; BART-Large-CNN had similar results. For cardiac catheterization, BART-Large-CNN slightly outperformed BioclinicalBERT at the probability threshold of 0.1: mean accuracy 94.9% vs 94.3%, precision 96.7% vs 96.3%, recall 96.1% vs 95.7%, and F1-score 0.96 vs 0.96. Most individual categories showed strong performance, though a few (eg, prosthetic mitral valve, right atrial pressure) had lower scores. Performance improved with more training data but plateaued around 1000 reports.

Discussion and conclusion

Fine-tuned transformer-based LMs can effectively extract structured data from unstructured cardiac reports, supporting automated information extraction to enhance research and clinical applications.
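
The validation split and the training-size experiment described in the methods can be sketched as follows. This is only an illustration under stated assumptions: the `split_reports` and `subsample` helpers are hypothetical, integer IDs stand in for reports, and the source does not say whether the 5 seeds varied the split, the model initialization, or both.

```python
import random


def split_reports(report_ids, n_validation=300, seed=0):
    """Randomly hold out n_validation reports; the rest are for training."""
    rng = random.Random(seed)
    shuffled = list(report_ids)
    rng.shuffle(shuffled)
    return shuffled[n_validation:], shuffled[:n_validation]


def subsample(train_ids, size, seed=0):
    """Draw a smaller training set to study the effect of training size."""
    return random.Random(seed).sample(train_ids, size)


# Report counts are taken from the abstract; the IDs themselves are placeholders.
echo_train, echo_val = split_reports(range(3286))
cath_train, cath_val = split_reports(range(1884))

# One point on a hypothetical learning curve, near the reported plateau.
echo_train_1000 = subsample(echo_train, 1000)
```

With 300 reports held out per type, 2986 echocardiography and 1584 cardiac catheterization reports remain for training, so a plateau around 1000 reports is reached well within the available data for both report types.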