September 5, 2025. Open Access.

Asymptotic generalization error in the online learning of linearized two-layer neural networks

Key Points

The study finds that training dynamics converge to a finite generalization error that does not vanish.
Analytical predictions show that the random-feature contribution to the student's output is effectively suppressed when the first-layer weights are also trained.
Using tools from statistical mechanics, the study analyzes the high-dimensional limit in which the numbers of input features and hidden units both tend to infinity at a finite ratio K/N, in order to determine the minimal achievable test error.
Numerical simulations confirm the analytical results, illustrating how the interplay between random features and first-layer linearization determines the generalization error.

Abstract

We investigate the generalization properties of over-parameterized, two-layer neural networks in the so-called ‘lazy training’ regime, where weight updates remain small around their initial values. Using a student-teacher framework, we focus on the interplay between random features and first-layer linearization in determining the minimal achievable test error. Our analysis uses tools from statistical mechanics to study a high-dimensional limit in which the numbers of input features and hidden units both tend to infinity with a finite ratio K/N. We find that the random-feature contribution to the student’s output is effectively suppressed when the first-layer weights are also trained, yielding a finite plateau in the generalization error. By explicitly linearizing in the changes of hidden-unit weights, we derive a closed-form expression for this asymptotic error plateau. Numerical simulations confirm our analytical predictions, showing that the training dynamics converge to a small, finite generalization error that does not vanish, even as K, N → ∞. These findings illustrate how training the first-layer weights modifies the random-feature model results.
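
To make the idea of "linearizing in the changes of hidden-unit weights" concrete, the following is a minimal sketch of a student network expanded to first order around its initial first-layer weights and trained online, one fresh example per step. It is not the paper's model: the tanh activation, the 1/sqrt(N) and 1/sqrt(K) scalings, the single-unit teacher, the learning rate, and all function names are assumptions chosen only for illustration.

```python
import numpy as np

def full_student(x, W, v):
    """Two-layer network output with first-layer weights W and second-layer weights v."""
    N, K = x.shape[0], v.shape[0]
    return v @ np.tanh(W @ x / np.sqrt(N)) / np.sqrt(K)

def linearized_student(x, W0, dW, v):
    """First-order expansion of the network in the weight change dW around W0."""
    N, K = x.shape[0], v.shape[0]
    h0 = W0 @ x / np.sqrt(N)   # pre-activations at initialization
    dh = dW @ x / np.sqrt(N)   # first-order change in the pre-activations
    g = np.tanh(h0)            # random-feature term
    gp = 1.0 - g ** 2          # derivative of tanh at h0
    return v @ (g + gp * dh) / np.sqrt(K)

def online_sgd(N=50, K=20, steps=2000, lr=0.05, seed=0):
    """Online SGD on the linearized student; returns a Monte Carlo test-error estimate."""
    rng = np.random.default_rng(seed)
    W0 = rng.standard_normal((K, N))       # frozen initial first-layer weights
    dW = np.zeros((K, N))                  # trained first-layer correction
    v = rng.standard_normal(K)             # trained second-layer weights
    w_teacher = rng.standard_normal(N)     # hypothetical single-unit teacher
    for _ in range(steps):
        x = rng.standard_normal(N)         # fresh example at every step
        y = np.tanh(w_teacher @ x / np.sqrt(N))
        h0 = W0 @ x / np.sqrt(N)
        g = np.tanh(h0)
        gp = 1.0 - g ** 2
        phi = g + gp * (dW @ x / np.sqrt(N))
        err = v @ phi / np.sqrt(K) - y
        # Gradients of 0.5 * err**2, both evaluated at the current weights.
        grad_v = err * phi / np.sqrt(K)
        grad_dW = err * np.outer(v * gp, x) / np.sqrt(K * N)
        v -= lr * grad_v
        dW -= lr * grad_dW
    # Estimate the generalization error on held-out inputs.
    X = rng.standard_normal((1000, N))
    preds = np.array([linearized_student(xi, W0, dW, v) for xi in X])
    targets = np.tanh(X @ w_teacher / np.sqrt(N))
    return 0.5 * np.mean((preds - targets) ** 2)
```

Calling `online_sgd()` for increasing `steps` gives a rough view of how the estimated test error evolves in this toy setting; reproducing the plateau described in the abstract would require the paper's actual model and scalings.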
