What does this research mean for the field?

Arrhythmias significantly reduce the specificity of a deep learning model for predicting echocardiographic abnormalities, with AUROC dropping to 0.68 for atrial fibrillation and 0.62 for pacemaker rhythm. Novelty: novel finding. Consensus alignment: challenges consensus.

What question did this study set out to answer?

The aim is to evaluate how arrhythmias affect the accuracy of a deep learning model predicting echocardiographic abnormalities.

February 8, 2026

Impact of arrhythmias on the performance of a deep learning model for predicting echocardiographic abnormalities from electrocardiograms

Key result

The deep learning model's AUROC for predicting echocardiographic abnormalities dropped from 0.78 (no arrhythmia) to 0.68 in AF and 0.62 in pacemaker rhythm. In both of these subgroups the model classified every case as positive, giving 100% sensitivity but 0% specificity.

Key points

The aim is to evaluate how arrhythmias affect the accuracy of a deep learning model predicting echocardiographic abnormalities.
Used 229,439 paired ECG-echocardiography datasets from eight centers.
Focused on an arrhythmia-labeled subset of 29,411 pairs from one validation center.
Trained DL models to predict 12 echocardiographic findings.
Applied logistic regression for composite labeling.
Assessed model performance metrics across various arrhythmia subgroups.
Composite label achieved AUROC of 0.78 without arrhythmias, dropping to 0.75 with any arrhythmia.
AF and PM significantly reduced model performance, with AUROCs of 0.68 and 0.62, respectively.
PVC had less impact, achieving an AUROC of 0.80.
Confusion matrix indicated 100% sensitivity and zero specificity in AF and PM cases.
High sensitivity contrasts with low specificity, limiting clinical utility.
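
The 100% sensitivity and zero specificity reported for AF and PM follow directly from a model that labels every case positive. The sketch below illustrates this with a small hypothetical subgroup; the labels are invented for illustration and are not study data, and the helper name `sensitivity_specificity` is an assumption.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels and predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    sensitivity = tp / (tp + fn)  # share of true positives that are detected
    specificity = tn / (tn + fp)  # share of true negatives that are cleared
    return sensitivity, specificity


# Hypothetical toy subgroup (not study data): 3 positives, 2 negatives.
y_true = [1, 1, 1, 0, 0]
# The model flags every case as positive, as reported for AF and PM.
y_pred = [1] * len(y_true)

sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens, spec)  # 1.0 0.0
```

Because no negative case is ever cleared, every patient without an abnormality becomes a false positive, which is why the summary describes the clinical utility as limited.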

Structured PICO

Does the presence of arrhythmias reduce the predictive accuracy of a deep learning model for echocardiographic abnormalities in patients undergoing ECG?

Population

29,411 paired ECG-echocardiography datasets from one validation center with arrhythmia-labeled data (subset of a larger 229,439 dataset from 8 centers)

Intervention

Deep learning-based model for predicting 12 echocardiographic findings related to heart failure from ECGs in the presence of arrhythmias (atrial fibrillation, premature ventricular contractions, and pacemaker rhythm)

Comparator

Deep learning model performance in patients without arrhythmias

Outcome

Model performance metrics, including area under the receiver-operating characteristic curve (AUROC), sensitivity, and specificity, for a composite label of 12 echocardiographic findings. Outcome type: surrogate.
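
AUROC is computed from the ranking of scores and does not depend on a decision threshold, whereas sensitivity and specificity are measured at one fixed threshold. This is why a subgroup can show an AUROC well above 0.5 and still have zero specificity. The sketch below is a minimal illustration with hypothetical scores and an assumed threshold of 0.5; the source does not report the threshold used or the score distributions.

```python
def auroc(y_true, scores):
    """AUROC as the probability that a random positive case outscores
    a random negative case (ties count as half a win)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical scores (not study data): all of them exceed the threshold.
y_true = [1, 1, 1, 0, 0]
scores = [0.95, 0.90, 0.70, 0.80, 0.60]
threshold = 0.5  # assumed operating point, not taken from the source

auc = auroc(y_true, scores)
y_pred = [1 if s >= threshold else 0 for s in scores]
negatives_flagged = sum(p for t, p in zip(y_true, y_pred) if t == 0)

print(round(auc, 2))       # ranking still separates the classes fairly well
print(negatives_flagged)   # yet both negatives are called positive
```

If the scores in a subgroup behaved this way, choosing a threshold specific to that subgroup might restore some specificity. Whether that applies to the model in this study is not something the source establishes.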

Arrhythmias, particularly atrial fibrillation and pacemaker rhythms, significantly degrade the specificity of deep learning models used to predict echocardiographic abnormalities from ECGs, limiting their clinical utility.

Numerical result

Absolute event rate: 0% vs 0%

Abstract

Background: Heart failure requires early diagnosis, and targeted screening could improve if functional and morphological cardiac abnormalities were detectable on an electrocardiogram (ECG). While deep learning (DL) models have shown promise in predicting echocardiographic findings, their performance in patients with arrhythmias remains underexplored.

Purpose: This study aimed to evaluate the impact of arrhythmias, including atrial fibrillation (AF), premature ventricular contractions (PVC), and pacemaker rhythm (PM), on the ability of a DL-based model to predict echocardiographic abnormalities. By assessing how arrhythmias influence model accuracy, we sought to identify potential limitations and areas for improvement in artificial intelligence (AI)-assisted cardiac screening.

Methods: We utilized 229,439 paired ECG and echocardiography datasets from eight centers (six for model development and two for external validation). In previous analyses, external validation was conducted using data from both centers. In this study, we focused on a subset of one validation center that provided arrhythmia-labeled data, consisting of 29,411 ECG-echocardiography pairs. DL-based models were trained to predict 12 echocardiographic findings related to heart failure. Logistic regression was applied to generate a composite label, considered positive if any of the 12 findings were present. Model performance metrics, including area under the receiver-operating characteristic curve (AUROC), sensitivity, and specificity, were evaluated for AF, PVC, and PM subgroups individually, as well as for groups stratified by the presence or absence of any arrhythmia.

Results: The composite label achieved an AUROC of 0.78 in patients without arrhythmias, which decreased to 0.75 in those with any arrhythmia. Subgroup analyses revealed that this decline was primarily driven by AF and PM, with AUROCs of 0.68 and 0.62, respectively. In contrast, PVC showed a relatively smaller impact, with an AUROC of 0.80. Further examination of the confusion matrix revealed that in the presence of AF or PM, the model classified all cases as positive for the composite label, resulting in 100% sensitivity but zero specificity. This pattern suggests that the model struggles to distinguish between positive and negative cases in these subgroups, likely due to altered ECG signal patterns or limited training data.

Conclusions: Arrhythmias significantly impact the performance of DL-based models for predicting echocardiographic abnormalities. While sensitivity remains high, reduced specificity in AF and PM groups limits clinical utility and may lead to increased false positives. Further investigation is needed to determine whether these limitations stem from inadequate training data or intrinsic model weaknesses. This study highlights the need to address arrhythmia-related limitations to improve the robustness of DL-based screening tools in clinical practice.

Figure: ROC curve, with and without arrhythmia.
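
The abstract describes a composite label that is positive if any of the 12 findings is present, and an evaluation stratified by rhythm subgroup. The sketch below shows one way that labeling and grouping could be expressed. The records, the rhythm tags, and the helper name `composite_label` are hypothetical, and the sketch does not reproduce the authors' logistic regression step.

```python
from collections import defaultdict


def composite_label(findings):
    """Return 1 if any individual finding flag is set, otherwise 0."""
    return int(any(findings))


# Hypothetical records (not study data): rhythm tag plus 12 binary flags.
records = [
    ("none", [0] * 12),
    ("AF", [0] * 11 + [1]),
    ("AF", [0] * 12),
    ("PVC", [1] + [0] * 11),
]

# Collect composite labels per rhythm subgroup.
by_rhythm = defaultdict(list)
for rhythm, findings in records:
    by_rhythm[rhythm].append(composite_label(findings))

# Pool every arrhythmia subgroup to form the "any arrhythmia" stratum.
any_arrhythmia = [y for r, ys in by_rhythm.items() if r != "none" for y in ys]

print(dict(by_rhythm))
print(any_arrhythmia)
```

Each subgroup's labels can then be paired with the model's scores to compute AUROC, sensitivity, and specificity separately per stratum.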

Cite This Study

Fujiki et al. reported that the deep learning model's AUROC for predicting echocardiographic abnormalities dropped from 0.78 (no arrhythmia) to 0.68 in AF and 0.62 in pacemaker rhythm, with 100% sensitivity but 0% specificity in those two subgroups.

synapsesocial.com/papers/698828eb0fc35cd7a8848daa https://doi.org/10.1093/eurheartj/ehaf784.4389
