July 25, 2025

Systematic Evaluation of Label Noise Effects on Accuracy and Calibration in Deep Neural Networks

Key Points

Label noise significantly degrades both the accuracy and the calibration of deep neural networks trained on the corrupted labels.
Asymmetric (class-dependent) label noise at 60% corruption drops test accuracy to about 38.7% and pushes Expected Calibration Error (ECE) above 35%.
Symmetric (random) noise at the same level yields higher predictive entropy and a comparatively modest ECE of about 9%, so the calibration impact depends on the noise type.
The findings emphasize the need to distinguish between noise types when assessing model robustness and reliability.
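
To make the distinction between the two noise types concrete, here is a minimal NumPy sketch of how labels might be corrupted. It is not the study's code. The function names, the `NUM_CLASSES` constant, and in particular the default asymmetric flip rule (each class flips to the next class index) are assumptions made for illustration; the source does not say which class-to-class mapping was used or how the corruption rate was defined per class.

```python
import numpy as np

NUM_CLASSES = 10  # CIFAR-10 has 10 classes


def add_symmetric_noise(labels, rate, rng):
    """Flip each label with probability `rate` to a uniformly chosen *other* class."""
    labels = np.asarray(labels)
    noisy = labels.copy()
    flip = rng.random(labels.shape[0]) < rate
    # An offset in [1, NUM_CLASSES - 1] guarantees the new label differs from the old one.
    offsets = rng.integers(1, NUM_CLASSES, size=flip.sum())
    noisy[flip] = (labels[flip] + offsets) % NUM_CLASSES
    return noisy


def add_asymmetric_noise(labels, rate, rng, flip_map=None):
    """Flip each label with probability `rate` to one fixed target class per source class.

    `flip_map[c]` is the class that label `c` is corrupted into. The default
    (c -> c + 1 mod NUM_CLASSES) is a hypothetical mapping, not the study's.
    """
    labels = np.asarray(labels)
    if flip_map is None:
        flip_map = (np.arange(NUM_CLASSES) + 1) % NUM_CLASSES
    flip_map = np.asarray(flip_map)
    noisy = labels.copy()
    flip = rng.random(labels.shape[0]) < rate
    noisy[flip] = flip_map[labels[flip]]
    return noisy


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = rng.integers(0, NUM_CLASSES, size=50_000)  # stand-in for CIFAR-10 training labels
    for rate in (0.0, 0.1, 0.2, 0.4, 0.6):
        sym = add_symmetric_noise(clean, rate, rng)
        asym = add_asymmetric_noise(clean, rate, rng)
        print(rate, (sym != clean).mean(), (asym != clean).mean())
```

Under this sketch, symmetric noise spreads corrupted labels evenly over all other classes, whereas asymmetric noise concentrates them on a single wrong class per source class, which gives the model a consistent but incorrect target to fit.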

Abstract

Label noise is a pervasive issue in real-world datasets that can degrade both the accuracy and calibration of deep neural networks. In this study, we systematically examine how symmetric (random) and asymmetric (class-dependent) label noise influence model accuracy and confidence calibration in image classification using the CIFAR-10 dataset and a ResNet-18 architecture. We apply five levels of label noise (0%, 10%, 20%, 40%, 60%) and evaluate their effects using three metrics: test accuracy, Expected Calibration Error (ECE), and predictive entropy. Our findings show that increasing noise levels significantly degrade classification accuracy and impair model calibration. In particular, asymmetric noise at a 60% corruption level causes test accuracy to drop to approximately 38.7% while ECE surges above 35%, indicating extreme overconfidence in incorrect predictions. By contrast, symmetric noise at the same noise level yields higher predictive entropy (uncertainty) and comparatively modest miscalibration (ECE of approximately 9%). These results highlight the importance of distinguishing noise types when assessing model robustness and reliability. All experiments are reproducible, with code and data publicly available to facilitate further investigation.
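
The two calibration-related metrics named in the abstract can be sketched as follows. This is an illustrative implementation, not the study's: it assumes the common equal-width binning form of ECE with the maximum softmax probability as confidence, a hypothetical default of 15 bins, and entropy measured in nats. The source does not state any of these choices.

```python
import numpy as np


def expected_calibration_error(probs, labels, n_bins=15):
    """Weighted average gap between confidence and accuracy over confidence bins.

    probs:  (N, K) array of predicted class probabilities
    labels: (N,) array of true class indices
    n_bins: number of equal-width bins (assumed value, not from the source)
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    confidences = probs.max(axis=1)        # confidence = top predicted probability
    predictions = probs.argmax(axis=1)
    correct = (predictions == labels).astype(float)

    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
            ece += in_bin.mean() * gap     # weight by the fraction of samples in the bin
    return ece


def predictive_entropy(probs, eps=1e-12):
    """Per-sample Shannon entropy (in nats) of the predicted distribution."""
    probs = np.asarray(probs, dtype=float)
    return -(probs * np.log(probs + eps)).sum(axis=1)


if __name__ == "__main__":
    # Toy example: a model that is 90% confident on two samples but right on only one.
    toy_probs = np.array([[0.9, 0.1], [0.9, 0.1]])
    toy_labels = np.array([0, 1])
    print(expected_calibration_error(toy_probs, toy_labels))
    print(predictive_entropy(toy_probs))
```

Read this way, a high ECE combined with low entropy corresponds to the overconfident regime the abstract reports for asymmetric noise, while higher entropy with a smaller ECE matches the symmetric case.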

Cite This Study

Christopher Boseak studied this question.

synapsesocial.com/papers/689a0627e6551bb0af8ce1e3 https://doi.org/10.21203/rs.3.rs-7197053/v1
