March 25, 2024 | Open Access

Low-dimensional intrinsic dimension reveals a phase transition in gradient-based learning of deep neural networks

Abstract

Deep neural networks complete a feature extraction task by propagating the inputs through multiple modules. However, how the representations evolve under gradient-based optimization remains unknown. Here we leveraged the intrinsic dimension of the representations to study the learning dynamics and found that the training process underwent a phase transition from expansion to compression under disparate training regimes, a phenomenon that is ubiquitous across a wide variety of model architectures, optimizers, and data sets. We showed that the variation in the intrinsic dimension is consistent with the complexity of the learned hypothesis, which can be quantitatively assessed by the critical sample ratio, a measure rooted in adversarial robustness. Meanwhile, we mathematically demonstrated that this phenomenon can be analyzed in terms of the mutable correlation between neurons. Although the evoked activities obey a power-law decaying rule in biological circuits, we identified that the power-law exponent of the representations in deep neural networks predicted adversarial robustness well only at the end of training, not during the training process. Together, these results suggest that deep neural networks are prone to producing robust representations by adaptively eliminating or retaining redundancies. The code is publicly available at https://github.com/cltan023/learning2022.
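To make the notion of intrinsic dimension concrete, the following is a minimal sketch of one common estimator, the maximum-likelihood form of TwoNN, applied to a matrix of representations (one row per sample). The abstract does not say which estimator the authors used, so the choice of TwoNN, the function name `twonn_intrinsic_dimension`, and the synthetic data are all assumptions made for illustration; the linked repository is the authoritative reference for the actual method.

```python
import numpy as np


def twonn_intrinsic_dimension(x):
    """Estimate intrinsic dimension from the ratio of each point's
    second- to first-nearest-neighbor distance (TwoNN, MLE form).

    Illustrative only: the estimator used in the paper may differ.
    """
    x = np.asarray(x, dtype=np.float64)
    # Pairwise squared Euclidean distances via the Gram matrix.
    sq = np.sum(x * x, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)
    np.maximum(d2, 0.0, out=d2)      # clamp tiny negatives from round-off
    np.fill_diagonal(d2, np.inf)     # exclude self-distances
    # After partitioning at index 1, columns 0 and 1 hold the two
    # smallest distances of each row, in ascending order.
    nearest = np.sqrt(np.partition(d2, 1, axis=1)[:, :2])
    r1, r2 = nearest[:, 0], nearest[:, 1]
    keep = r1 > 0                    # drop exact duplicate points
    mu = r2[keep] / r1[keep]
    # Under the TwoNN model, log(mu) is exponential with rate d,
    # so the MLE of d is N / sum(log(mu)).
    return keep.sum() / np.sum(np.log(mu))


# Hypothetical stand-in for layer activations: a 2-D Gaussian cloud
# embedded isometrically in a 10-D ambient space.
rng = np.random.default_rng(0)
latent = rng.standard_normal((1000, 2))
basis, _ = np.linalg.qr(rng.standard_normal((10, 10)))
activations = latent @ basis[:2, :]
est = twonn_intrinsic_dimension(activations)
print(f"estimated intrinsic dimension: {est:.2f}")
```

In a training-dynamics study of the kind described above, an estimator like this would be evaluated on a fixed batch of inputs at successive checkpoints, so that a rise followed by a fall in the estimate can be read as expansion followed by compression.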
