December 31, 2019

Accelerating Minibatch Stochastic Gradient Descent Using Typicality Sampling

Abstract

Machine learning, especially deep neural networks, has developed rapidly in fields including computer vision, speech recognition, and reinforcement learning. Although minibatch stochastic gradient descent (SGD) is one of the most popular stochastic optimization methods for training deep networks, it converges slowly because of the large noise in its gradient approximation. In this article, we attempt to remedy this problem by building a more efficient batch selection method based on typicality sampling, which reduces the error of gradient estimation in conventional minibatch SGD. We analyze the convergence rate of the resulting typical batch SGD algorithm and compare its convergence properties with those of conventional minibatch SGD. Experimental results demonstrate that our batch selection scheme works well and that more complex minibatch SGD variants can also benefit from the proposed batch selection strategy.
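To make the "noise in the gradient approximation" concrete, the following is a minimal sketch that measures how far a uniformly sampled minibatch gradient falls from the full-data gradient. It does not implement the paper's typicality sampling, whose details are not given on this page. The synthetic one-parameter regression problem, the helper names, and the batch sizes are all assumptions chosen for illustration.

```python
import random

def full_gradient(w, data):
    """Gradient w.r.t. w of the mean of 0.5 * (w*x - y)^2 over all samples."""
    return sum((w * x - y) * x for x, y in data) / len(data)

def minibatch_gradient(w, data, batch_size, rng):
    """The same gradient, estimated from a uniformly sampled minibatch."""
    batch = rng.sample(data, batch_size)
    return sum((w * x - y) * x for x, y in batch) / batch_size

def estimate_error(w, data, batch_size, rng, trials=500):
    """Average squared gap between the minibatch and full gradients."""
    g = full_gradient(w, data)
    total = 0.0
    for _ in range(trials):
        total += (minibatch_gradient(w, data, batch_size, rng) - g) ** 2
    return total / trials

rng = random.Random(0)

# Hypothetical synthetic data: y = 3x + Gaussian noise.
data = []
for _ in range(1000):
    x = rng.uniform(-1.0, 1.0)
    data.append((x, 3.0 * x + rng.gauss(0.0, 0.5)))

w = 0.0  # evaluate the gradient estimate at an arbitrary starting point
err_small = estimate_error(w, data, 4, rng)
err_large = estimate_error(w, data, 64, rng)
print(f"batch size 4:  mean squared gradient error = {err_small:.4f}")
print(f"batch size 64: mean squared gradient error = {err_large:.4f}")
```

Under uniform sampling the error shrinks only as the batch grows, which is the cost the abstract refers to. A batch selection scheme such as the one the paper proposes aims to lower this error without simply enlarging the batch.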

Authors

Xinyu Peng

China Jiliang University

Li Li

Ningbo University

Fei‐Yue Wang

Chinese Academy of Sciences

Journals

IEEE Transactions on Neural Networks and Learning Systems

Institutions

Chinese Academy of Sciences

Tsinghua University

Shandong Institute of Automation

Cite this study

Peng et al. (2019) studied this question.

synapsesocial.com/papers/6a1c0700b33628da419d20f5 — DOI: https://doi.org/10.1109/tnnls.2019.2957003

Also consider

Synapse has enriched 3 closely related papers on similar research questions. Consider them for comparative context:

On the Convergence of Adam and Beyond · 2019 · 1,616 citations
Neuro Dynamic Programming· 2013 · 294 citations
On the momentum term in gradient descent learning algorithms· 1999 · 2,336 citations