What does this research mean for the field?

Targeted deployment of AI-ECG increased the F1-score by a median of 0.25 and reduced false positive screens by 87.8% compared to untargeted screening for structural heart disease. Novelty: novel finding. Consensus alignment: neutral.

What question did this study set out to answer?

The study asks whether existing electronic health record (EHR) data can guide the targeted deployment of AI-ECG to improve the diagnosis of structural heart disorders (SHD).

February 8, 2026

Guiding the targeted deployment of AI-ECG for the precision diagnosis of structural heart disorders in the electronic health record

Key Result

Targeted AI-ECG screening for structural heart disease reduced false positive screens by 87.8% and improved the F1-score by a median of 0.25 compared to untargeted screening.

Key Points

Analyzed data from 159,322 individuals within a U.S. health system from 2013 to 2021.
Developed a targeted screening pipeline using EHR and ECG data to identify 27 SHD types.
Compared targeted AI-ECG deployment against opportunistic deployment in temporally and geographically distinct datasets.
Tuned the EHR-based model to a sensitivity of ≥90% for identifying individuals with SHDs.
Targeted deployment resulted in a median 0.25 absolute increase in F1-score.
Observed an 87.8% relative decrease in false positive screens, improving screening accuracy.
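
To make the link between gating and false positives concrete, here is a minimal sketch of sequential (targeted) versus screen-all (untargeted) deployment. It is not the study's pipeline: the names `ScreenCounts`, `screen`, and `ehr_gate` are hypothetical, and the 20-person cohort is invented toy data, not study results.

```python
from dataclasses import dataclass


@dataclass
class ScreenCounts:
    tp: int  # true positive screens
    fp: int  # false positive screens
    fn: int  # missed cases (never screened, or screened negative)

    @property
    def f1(self) -> float:
        # F1 = 2TP / (2TP + FP + FN)
        denom = 2 * self.tp + self.fp + self.fn
        return 2 * self.tp / denom if denom else 0.0


def screen(labels, ecg_positive, ehr_gate=None):
    """Count screening outcomes.

    If ehr_gate is None, everyone receives the AI-ECG (untargeted).
    Otherwise only people with a truthy gate value receive it (targeted).
    """
    tp = fp = fn = 0
    for i, (has_shd, ecg_pos) in enumerate(zip(labels, ecg_positive)):
        screened = True if ehr_gate is None else bool(ehr_gate[i])
        flagged = screened and bool(ecg_pos)
        if flagged and has_shd:
            tp += 1
        elif flagged:
            fp += 1
        elif has_shd:
            fn += 1
    return ScreenCounts(tp, fp, fn)


# Invented toy cohort: 4 people with SHD, 16 without.
labels = [1] * 4 + [0] * 16
ecg_positive = [1, 1, 1, 0] + [1] * 6 + [0] * 10
ehr_gate = [1, 1, 1, 1] + [1, 0, 0, 0, 0, 0] + [1, 1, 1] + [0] * 7

untargeted = screen(labels, ecg_positive)
targeted = screen(labels, ecg_positive, ehr_gate)

f1_gain = targeted.f1 - untargeted.f1
fp_reduction = (untargeted.fp - targeted.fp) / untargeted.fp

print(f"Absolute F1 gain: {f1_gain:.2f}")
print(f"Relative drop in false positive screens: {fp_reduction:.1%}")
```

In this toy setup the gate keeps every true case, so the gain comes entirely from not screening low-risk people who would have produced false positive ECG flags. A gate that misses true cases would add false negatives and shrink the F1 gain.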

Structured PICO

Does targeted deployment of AI-ECG guided by EHR data improve diagnostic performance and reduce false positives for structural heart disease screening compared to untargeted deployment?

Population

159,322 individuals in the development dataset (median age 68, 50.4% women) from a large U.S.-based health system, with validation in a temporally distinct dataset of 5,198 individuals and a geographically distinct test set of 33,518 individuals from the UK Biobank.

Intervention

Targeted deployment of AI-ECG for structural heart disease screening, guided by longitudinal electronic health record (EHR) representations using a foundation model (CLMBR-T) and an ECG image vision transformer (ViT).
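
The EHR stage is described as being optimized to a sensitivity of ≥90%. One common way to do that is to pick the score cutoff on a development set; the sketch below assumes this approach, which the source does not confirm. The function `threshold_for_sensitivity` and all scores are hypothetical illustrations, not CLMBR-T outputs.

```python
import math


def threshold_for_sensitivity(scores, labels, target=0.90):
    """Return the highest cutoff at which at least `target` of the
    true cases have a score >= cutoff."""
    positives = sorted((s for s, y in zip(scores, labels) if y), reverse=True)
    if not positives:
        raise ValueError("need at least one positive case")
    # Number of true cases that must be captured; the small epsilon
    # guards against floating-point round-up in target * n.
    needed = max(1, math.ceil(target * len(positives) - 1e-9))
    return positives[needed - 1]


# Invented EHR risk scores: 10 people with SHD, 5 without.
pos_scores = [0.95, 0.90, 0.85, 0.80, 0.70, 0.65, 0.60, 0.55, 0.40, 0.10]
neg_scores = [0.50, 0.45, 0.30, 0.20, 0.05]
scores = pos_scores + neg_scores
labels = [1] * len(pos_scores) + [0] * len(neg_scores)

cutoff = threshold_for_sensitivity(scores, labels, target=0.90)
gated_in = [s >= cutoff for s in scores]  # who proceeds to AI-ECG
sensitivity = sum(g for g, y in zip(gated_in, labels) if y) / sum(labels)

print(f"cutoff={cutoff}, sensitivity={sensitivity:.2f}, gated in={sum(gated_in)}")
```

A high-sensitivity cutoff deliberately favors recall at this first stage, leaving the second-stage ECG model to supply precision.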

Comparator

Untargeted (opportunistic) deployment of AI-ECG across all individuals.

Outcome

Diagnostic performance for detecting 27 structural heart diseases, measured by F1-score, false positive rates, and AUROC. Outcome type: surrogate.
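
For readers less familiar with AUROC, it can be read as the probability that a randomly chosen case scores higher than a randomly chosen non-case. The sketch below computes it by direct pairwise comparison; the function name `auroc` and the scores are illustrative assumptions, and this is not how the study necessarily computed it.

```python
def auroc(scores, labels):
    """Pairwise (Mann-Whitney) AUROC: the share of case/non-case pairs
    in which the case has the higher score; ties count as half."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented model scores for three cases and three non-cases.
example_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
example_labels = [1, 1, 1, 0, 0, 0]
example_auroc = auroc(example_scores, example_labels)
print(f"AUROC: {example_auroc:.3f}")
```

Unlike the F1-score, AUROC does not depend on a chosen decision threshold or on disease prevalence, which is why a model can discriminate well and still generate many false positive screens in a low-prevalence population.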

Using longitudinal EHR data to guide targeted AI-ECG screening for structural heart disease improves diagnostic precision and reduces false positive screens compared to untargeted screening.

Abstract

Background: Artificial intelligence (AI) can enable scalable screening of structural heart disease (SHD) from routine electrocardiograms (ECGs), yet broad clinical adoption remains limited due to high false positive rates and a lack of targeted deployment strategies.

Purpose: To develop and validate a multi-modal approach that leverages existing patient data in the electronic health record (EHR) to guide the targeted deployment of AI-ECG for SHD screening.

Methods: Our development dataset included 159,322 individuals (median age 68 [IQR: 57-78] years, 50.4% women) across a large U.S.-based health system between 2013 and 2021, with 118 million coded EHR events, as well as 754,533 pairs of temporally linked ECG images and echocardiography reports (≤90 days between studies). We designed an algorithmic pipeline that leverages joint EHR and ECG embeddings for targeted SHD screening. First, we used a validated foundation model (CLMBR-T) to produce longitudinal EHR representations, which we fine-tuned to identify signatures of 27 SHDs, optimized to a sensitivity of ≥90%. This EHR-based step identifies the population with a high pre-test probability of these SHDs (Fig. 1a). Next, we used an ECG image vision transformer (ViT) developed using contrastive language-image pre-training against linked echocardiogram reports, optimized to detect SHDs with a balance of precision and recall (Fig. 1b). This targeted (sequential) screening approach was compared against an untargeted approach of opportunistically deploying AI-ECG across all individuals in i) a temporally distinct dataset of 5,198 individuals who had their first TTE in 2022-2023 within 90 days after their ECG, and ii) a geographically distinct test set of 33,518 individuals enrolled in the UK Biobank with concurrent ECG and cardiac magnetic resonance imaging.

Results: Using training embeddings as reference, our foundational ECG image encoder successfully discriminated a representative sample of 27 SHD labels during testing, including left ventricular systolic dysfunction (AUROC 0.90) and severe aortic stenosis (AUROC 0.85), among others (Fig. 2a). Targeted (vs opportunistic) deployment of AI-ECG led to a median 0.25 [IQR: 0.19-0.43] absolute increase in F1-score, which corresponded to an 87.8% [IQR: 82.4%-98.2%] relative decrease in false positive screens across the population (Fig. 2b). These gains were replicated in a geographically distinct test set, with a median 36.8% [IQR: 16.3%-54.4%] relative increase in F1-scores when examining AI-ECG performance in an enriched versus untargeted ("screen-all") population.

Conclusion: Using a deep learning representation of existing longitudinal EHR data can guide the efficient use of AI to screen for SHD from routinely available ECG images, thus minimizing false discovery and the possibility of unnecessary downstream testing.
