May 21, 2024Open Access

How glassy are neural networks?

Key Points

Key points are not available for this paper at this time.

Abstract

Deep Neural Networks (DNNs) share important similarities with structural glasses. Both have many degrees of freedom, and their dynamics are governed by a high-dimensional, non-convex landscape representing either the loss or energy, respectively. Furthermore, both experience gradient descent dynamics subject to noise. In this work we investigate, by performing quantitative measurements on realistic networks trained on the MNIST and CIFAR-10 datasets, the extent to which this qualitative similarity gives rise to glass-like dynamics in neural networks. We demonstrate the existence of a Topology Trivialisation Transition as well as the previously studied under-to-overparameterised transition analogous to jamming. By training DNNs with overdamped Langevin dynamics in the resulting disordered phases, we do not observe diverging relaxation times at non-zero temperature, nor do we observe any caging effects, in contrast to glass phenomenology. However, the weight overlap function follows a power law in time, with an exponent of approximately -0.5, in agreement with the Mode-Coupling Theory of structural glasses. In addition, the DNN dynamics obey a form of time-temperature superposition. Finally, dynamic heterogeneity and ageing are observed at low temperatures. These results highlight important and surprising points of both difference and agreement between the behaviour of DNNs and structural glasses.

Read Full Paperexternally

AIに質問

Bookmark

View Full Paper