What type of study is this?

This is a quantitative study.

October 16, 2025

Towards Better Generalization Bounds of Stochastic Optimization for Nonconvex Learning

Key Points

The analysis gives generalization bounds for stochastic optimization on nonconvex problems that improve on existing results.
It develops upper and lower bounds on the uniform convergence of gradients; the upper bound incorporates the second moment of the gradient at a single model.
Further assumptions such as quasi-convexity or the Polyak-Łojasiewicz condition yield better bounds for SGD than the general high-probability bound on the gradient norm of the population risk.
Variance reduction can further lower the computation cost, and under a privacy constraint the utility guarantee of SGD shows a linear speedup with respect to the batch size, which indicates a benefit of computing gradients in a distributed manner.
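
As a rough illustration of the quantities these points refer to, the sketch below runs minibatch SGD on a toy one-dimensional nonconvex loss and compares the gradient on the training set with the gradient on a large fresh sample used as a stand-in for the population. Everything here is an assumption made for illustration and is not taken from the paper: the loss f(w; z) = (w - z)^2 / (1 + (w - z)^2), the Gaussian data, the step size, the batch size, and all function names. It also measures the gradient gap only at the final iterate, not the supremum over all models that uniform convergence concerns.

```python
import random


def grad(w, z):
    # Derivative in w of the toy nonconvex loss (w - z)^2 / (1 + (w - z)^2).
    d = w - z
    return 2.0 * d / (1.0 + d * d) ** 2


def mean_grad(w, data):
    # Average gradient over a dataset (empirical risk gradient).
    return sum(grad(w, z) for z in data) / len(data)


def sgd(train, w0=2.0, lr=0.1, batch_size=8, steps=500, seed=0):
    # Plain minibatch SGD with a constant step size (illustrative settings).
    rng = random.Random(seed)
    w = w0
    for _ in range(steps):
        batch = rng.sample(train, batch_size)
        w -= lr * mean_grad(w, batch)
    return w


data_rng = random.Random(1)
train = [data_rng.gauss(0.0, 1.0) for _ in range(200)]
# A large fresh sample serves as a proxy for the population distribution.
fresh = [data_rng.gauss(0.0, 1.0) for _ in range(20000)]

w0 = 2.0
w = sgd(train, w0=w0)

initial_train_grad = abs(mean_grad(w0, train))
train_grad = abs(mean_grad(w, train))   # gradient norm of the empirical risk
pop_grad = abs(mean_grad(w, fresh))     # proxy for the population gradient norm
gap = abs(mean_grad(w, train) - mean_grad(w, fresh))

print(f"w = {w:.4f}")
print(f"train gradient norm: {train_grad:.4f}")
print(f"population (proxy) gradient norm: {pop_grad:.4f}")
print(f"gradient gap at w: {gap:.4f}")
```

By the triangle inequality, the population gradient norm at the returned model is at most the training gradient norm plus the gap, which is why a bound on the gap between the two gradients translates into a bound on the population gradient norm.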

Abstract

Stochastic optimization is the workhorse behind the success of many machine learning algorithms. The existing theoretical analysis of stochastic optimization mainly focuses on the behavior on the training dataset or requires a convexity assumption. In this paper, we provide a comprehensive analysis of the generalization behavior of stochastic optimization for nonconvex problems. We first present both upper and lower bounds on the uniform convergence of gradients. Our analysis outperforms existing results by incorporating the second moment of the gradient at a single model into the upper bound. Based on this uniform convergence, we provide a high-probability bound on the gradient norm of population risks for stochastic gradient descent (SGD), which significantly improves the existing results. We show that better bounds can be achieved under further assumptions such as quasi-convexity or the Polyak-Łojasiewicz condition. Our analysis shows that the computation cost can be further decreased by using the variance-reduction trick. Finally, we study the utility guarantee of SGD under a privacy constraint. Our results show a linear speedup with respect to the batch size, which shows the benefit of computing gradients in a distributed manner.
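
For readers unfamiliar with the terms in the abstract, the following block writes out the standard textbook definitions they usually refer to. The notation (f, F, F_S, W, n, μ) is assumed here for illustration and may differ from the paper's own; no rates or constants from the paper are reproduced.

```latex
% Population risk and empirical risk on a sample S = {z_1, ..., z_n}
F(w) = \mathbb{E}_{z}\big[ f(w; z) \big],
\qquad
F_S(w) = \frac{1}{n} \sum_{i=1}^{n} f(w; z_i).

% Uniform convergence of gradients over a model class W:
% the quantity to be bounded is the worst-case gap between the two gradients
\sup_{w \in W} \big\| \nabla F(w) - \nabla F_S(w) \big\|.

% Consequence for any model w in W (triangle inequality)
\big\| \nabla F(w) \big\|
\le \big\| \nabla F_S(w) \big\|
+ \sup_{w' \in W} \big\| \nabla F(w') - \nabla F_S(w') \big\|.

% Polyak-Lojasiewicz (PL) condition with parameter \mu > 0
F(w) - \inf_{w'} F(w') \le \frac{1}{2\mu} \big\| \nabla F(w) \big\|^{2}
\quad \text{for all } w.
```

Under the PL condition, a small gradient norm implies a small excess risk, which is one common reason a bound on the gradient norm can be sharpened into a bound on the risk itself.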

Authors

Yunwen Lei

University of Hong Kong

Journals

IEEE Transactions on Pattern Analysis and Machine Intelligence

Institutions

University of Hong Kong
