What is the clinical evidence from this study?

Study design: Other. Population: ECG segments used for signal processing (n=300). Intervention: Denoising Autoencoder (DAE) vs. Classical Autoencoder (AE). Primary outcome: Average pairwise correlation index between network model outputs.

What does this research mean for the field?

Adding autoencoder-generated reconstructions to the training set causes catastrophic performance collapse in ECG signal analysis. Novelty: novel finding. Consensus alignment: challenges consensus.

What question did this study set out to answer?

This study aims to investigate the effects of augmenting training data with autoencoder-generated outputs on model performance.

March 16, 2026 · Open Access

Training autoencoders on their own outputs causes collapse

Key Result

Training autoencoders on their own outputs caused a drastic drop in performance, leading to nearly complete correlation of outputs after approximately 100,000 steps.

Key Points

This study aims to investigate the effects of augmenting training data with autoencoder-generated outputs on model performance.
Trained classical autoencoders and denoising autoencoders on ECG signal data.
Generated reconstructions of the input data for potential data augmentation.
Evaluated performance by comparing results with and without augmented data.
Augmenting the training set with autoencoder outputs led to a significant drop in model performance.
The performance collapse was categorized as catastrophic, highlighting risks in using generated data.

Structured PICO

Population

300 ECG segments from 30 healthy patients (10 segments per patient), each containing P-wave, QRS-complex, and T-wave (95 datapoints per segment).

Intervention

Recursive data manipulation where autoencoder (AE) or denoising autoencoder (DAE) generated reconstructions are added back into the training set.
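
The feedback loop described above can be sketched as follows. This is a minimal illustration, not the study's method: it assumes NumPy, swaps in a PCA-style linear autoencoder for the trained AE/DAE networks, uses random placeholder data in place of the ECG segments, and assumes that every round reconstructs the whole current training set. The names `fit_linear_autoencoder`, `reconstruct`, `feedback_rounds`, and `n_components` are hypothetical. The sketch shows only how generated data flows back into training; it is not expected to reproduce the reported collapse.

```python
import numpy as np

def fit_linear_autoencoder(data, n_components):
    """Fit a PCA-style linear autoencoder; return (mean, components)."""
    mean = data.mean(axis=0)
    # Rows of vt are principal directions, ordered by singular value.
    _, _, vt = np.linalg.svd(data - mean, full_matrices=False)
    return mean, vt[:n_components]

def reconstruct(data, mean, components):
    """Encode to the latent space, then decode back to signal space."""
    latent = (data - mean) @ components.T
    return latent @ components + mean

def feedback_rounds(reference, n_rounds, n_components=8):
    """Retrain on the growing set and append its own reconstructions."""
    training = reference.copy()
    sizes = [len(training)]
    for _ in range(n_rounds):
        mean, components = fit_linear_autoencoder(training, n_components)
        generated = reconstruct(training, mean, components)
        # Self-contamination step: model outputs become training inputs.
        training = np.vstack([training, generated])
        sizes.append(len(training))
    return training, sizes

rng = np.random.default_rng(0)
# Placeholder with the reported shape (300 segments, 95 datapoints); not ECG data.
reference = rng.normal(size=(300, 95))
training, sizes = feedback_rounds(reference, n_rounds=3)
```

Under these assumptions the training set doubles every round, so the share of original reference data shrinks from 100% to 12.5% after three rounds. The comparator condition corresponds to calling `fit_linear_autoencoder` on `reference` alone.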

Comparator

Autoencoders trained only on original reference data without data feedback.

Outcome

Average pairwise correlation index between network model outputs to measure network collapse.
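
One plausible reading of this metric can be sketched in a few lines. The summary does not define the index, so the sketch assumes it is the mean Pearson correlation over all distinct pairs of output signals; the function names `pearson` and `average_pairwise_correlation` are hypothetical. Under that assumption, a value near 1 means the outputs have become nearly identical in shape, which is how collapse would show up.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    # Note: a constant signal has zero spread and raises ZeroDivisionError.
    return cov / (sx * sy)

def average_pairwise_correlation(outputs):
    """Mean Pearson correlation over all distinct pairs of outputs (assumed definition)."""
    pairs = list(combinations(outputs, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)

# Three outputs that are scaled copies of each other: fully correlated.
collapsed = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 6.0, 9.0]]
index = average_pairwise_correlation(collapsed)
```

For the 300 segments described in the Population section, this would average over 300 · 299 / 2 = 44,850 pairs.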

Training autoencoders on their own generated ECG data causes catastrophic performance collapse, demonstrating the risks of AI data self-contamination.

Limitations

Self-contamination of training data potentially limits generalizability.
Only 30 healthy patients were used for the ECG segments.

Abstract

Classical autoencoders (AE) learn a compressed, meaningful representation of the input data and denoising autoencoders (DAE) capture the true underlying data manifold even when inputs are noisy. Data is the foundation of artificial intelligence, and thus for all autoencoder types. However, all types produce, when well trained, output data which are similar to the input data. This could lead to output data being added to the data that is to be used for further learning. We show on ECG signals that adding AE/DAE-generated reconstructions to the training set — intended to augment data — causes catastrophic performance collapse.

Cite This Study

Thomas Schanze conducted a study (design: other) in ECG signal processing (n=300). A denoising autoencoder (DAE) and a classical autoencoder (AE) were evaluated on the average pairwise correlation index between network model outputs. Training autoencoders on their own outputs caused a drastic drop in performance, leading to nearly complete correlation of outputs after approximately 100,000 steps.

synapsesocial.com/papers/69b79e538166e15b153ab787 https://doi.org/10.18416/automed.2026.2470
