We study how convolutional neural networks reorganize information during learning in natural image classification tasks by tracking mutual information (MI) between inputs, intermediate representations, and labels. Across VGG-16, ResNet-18, and ResNet-50, we find that label-relevant MI grows reliably with depth while input MI depends strongly on architecture and activation, indicating that “compression” is not a universal phenomenon. Within convolutional layers, label information becomes increasingly concentrated in a small subset of channels; inference-time knockouts, shuffles, and perturbations confirm that these high-MI channels are functionally necessary for accuracy. This behavior suggests a view of representation learning driven by selective concentration and decorrelation rather than global information reduction. Finally, we show that a simple dependence-aware regularizer based on the Hilbert–Schmidt Independence Criterion can encourage these same patterns during training, yielding small accuracy gains and consistently faster convergence.
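For readers unfamiliar with the Hilbert–Schmidt Independence Criterion (HSIC) mentioned above, the following is a minimal sketch of the standard biased empirical estimator, tr(KHLH)/(n-1)^2, with Gaussian kernels. It is an illustration only and is not taken from the paper: the function names `gaussian_gram` and `hsic_biased`, the kernel choice, and the median-based bandwidth are all assumptions made here.

```python
import numpy as np

def gaussian_gram(x, sigma=None):
    """Gaussian-kernel Gram matrix for samples in the rows of x (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    sq = np.sum(x ** 2, axis=1)
    # Pairwise squared Euclidean distances, clipped to avoid tiny negatives.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (x @ x.T), 0.0)
    if sigma is None:
        # Assumption: median-based bandwidth heuristic; other choices are possible.
        pos = d2[d2 > 0]
        sigma = np.sqrt(np.median(pos)) if pos.size else 1.0
    return np.exp(-d2 / (2.0 * sigma ** 2))

def hsic_biased(x, y, sigma_x=None, sigma_y=None):
    """Biased empirical HSIC: trace(K H L H) / (n - 1)^2."""
    K = gaussian_gram(x, sigma_x)
    L = gaussian_gram(y, sigma_y)
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return float(np.trace(K @ H @ L @ H) / (n - 1) ** 2)

# Toy check: a nonlinearly dependent pair should score higher than an independent pair.
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 1))
y_dep = x ** 2 + 0.1 * rng.normal(size=(200, 1))
y_ind = rng.normal(size=(200, 1))
hsic_dep = hsic_biased(x, y_dep)
hsic_ind = hsic_biased(x, y_ind)
```

In a training setting one would typically compute a differentiable version of this quantity on mini-batch activations and add it to the loss with a weighting coefficient. Which pairs of variables the regularizer couples, and with what sign, is not specified in the abstract, so that part is left open here.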
Issitt et al. (Mon,) studied this question.