May 6, 2024 | Open Access

Rigorous Dynamical Mean-Field Theory for Stochastic Gradient Descent Methods

Abstract

We prove closed-form equations for the exact high-dimensional asymptotics of a family of first-order gradient-based methods, learning an estimator (e.g., M-estimator, shallow neural network) from observations on Gaussian data with empirical risk minimization. This includes widely used algorithms such as stochastic gradient descent (SGD) or Nesterov acceleration. The obtained equations match those resulting from the discretization of dynamical mean-field theory equations from statistical physics when applied to the corresponding gradient flow. Our proof method allows us to give an explicit description of how memory kernels build up in the effective dynamics and to include nonseparable update functions, allowing datasets with nonidentity covariance matrices. Finally, we provide numerical implementations of the equations for SGD with generic extensive batch size and constant learning rates.

Keywords: stochastic gradient descent, dynamical mean-field theory, iterative Gaussian conditioning

MSC codes: 68Q25, 68W99, 60G99, 62J99
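As a rough illustration of the setting named in the abstract (SGD with a batch size proportional to the number of samples and a constant learning rate, run on Gaussian data), the minimal Python sketch below simulates one such run directly. It does not implement the paper's dynamical mean-field equations. The least-squares loss, the linear teacher, the Bernoulli batch sampling, and every name and parameter value (`sgd_extensive_batch`, `batch_fraction`, `lr`, and so on) are assumptions chosen for illustration only.

```python
import random


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def sgd_extensive_batch(n=200, d=50, batch_fraction=0.5, lr=0.1,
                        steps=100, noise=0.1, seed=0):
    """Mini-batch SGD on a least-squares empirical risk with Gaussian data.

    Illustrative assumptions: i.i.d. N(0, 1) features (identity covariance),
    a linear teacher with additive Gaussian label noise, and a batch in which
    each sample is included independently with probability `batch_fraction`,
    so the batch size grows proportionally to n.
    Returns the full empirical risk recorded before training and after each step.
    """
    rng = random.Random(seed)
    w_star = [rng.gauss(0.0, 1.0) for _ in range(d)]            # teacher weights
    X = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    y = [dot(x, w_star) + noise * rng.gauss(0.0, 1.0) for x in X]

    def risk(w):
        # Empirical risk (1 / 2n) * sum_i (x_i . w - y_i)^2 over all n samples.
        return sum((dot(x, w) - yi) ** 2 for x, yi in zip(X, y)) / (2 * n)

    w = [0.0] * d
    history = [risk(w)]
    for _ in range(steps):
        # Extensive batch: on average batch_fraction * n samples per step.
        batch = [i for i in range(n) if rng.random() < batch_fraction]
        if not batch:
            history.append(history[-1])
            continue
        grad = [0.0] * d
        for i in batch:
            residual = dot(X[i], w) - y[i]
            for j in range(d):
                grad[j] += residual * X[i][j]
        # Constant learning rate, gradient averaged over the sampled batch.
        w = [wj - lr * gj / len(batch) for wj, gj in zip(w, grad)]
        history.append(risk(w))
    return history
```

Calling `sgd_extensive_batch()` returns the empirical risk along the trajectory. A curve of this kind, averaged over many seeds at large `n` and `d` with `n / d` fixed, is the sort of quantity that exact asymptotic equations would be compared against.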

Authors

Cédric Gerbelot

Emanuele Troiani

Francesca Mignacco

Journals

SIAM Journal on Mathematics of Data Science

Institutions

Centre National de la Recherche Scientifique

École Polytechnique Fédérale de Lausanne

Commissariat à l'Énergie Atomique et aux Énergies Alternatives
