September 9, 2016

Deep vs. shallow networks: An approximation theory perspective

Abstract

The paper briefly reviews several recent results on hierarchical architectures for learning from examples that may formally explain the conditions under which Deep Convolutional Neural Networks perform much better in function approximation problems than shallow, one-hidden-layer architectures. The paper announces new results for a non-smooth activation function, the ReLU function, used in present-day neural networks, as well as for Gaussian networks. We propose a new definition of relative dimension to encapsulate different notions of sparsity of a function class that can possibly be exploited by deep networks but not by shallow ones to drastically reduce the complexity required for approximation and learning.
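To make the two architecture families in the abstract concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's construction: the binary-tree layout, the function names (`make_shallow`, `make_tree`, and so on), and the random weights are all hypothetical. It shows structure only: a shallow network applies one hidden ReLU layer to all inputs at once, while a hierarchical network composes small shallow networks that each see only two inputs. It does not demonstrate any approximation rate.

```python
import random


def relu(x):
    return x if x > 0.0 else 0.0


def make_shallow(n_inputs, n_units, rng):
    """One-hidden-layer ReLU network R^n_inputs -> R with random weights."""
    return {
        "w": [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_units)],
        "b": [rng.uniform(-1, 1) for _ in range(n_units)],
        "a": [rng.uniform(-1, 1) for _ in range(n_units)],
    }


def eval_shallow(net, x):
    """Compute sum_k a_k * relu(<w_k, x> + b_k)."""
    total = 0.0
    for w, b, a in zip(net["w"], net["b"], net["a"]):
        total += a * relu(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return total


def count_params(net):
    return sum(len(w) for w in net["w"]) + len(net["b"]) + len(net["a"])


def make_tree(n_inputs, n_units, rng):
    """Binary tree of two-input shallow networks (assumed layout, not from the paper)."""
    if n_inputs < 2 or n_inputs & (n_inputs - 1) != 0:
        raise ValueError("n_inputs must be a power of two, at least 2")
    levels = []
    width = n_inputs
    while width > 1:
        width //= 2
        levels.append([make_shallow(2, n_units, rng) for _ in range(width)])
    return levels


def eval_tree(levels, x):
    """Evaluate level by level; each node consumes two adjacent values."""
    values = list(x)
    for level in levels:
        values = [eval_shallow(node, values[2 * i:2 * i + 2])
                  for i, node in enumerate(level)]
    return values[0]


def count_tree_params(levels):
    return sum(count_params(node) for level in levels for node in level)


rng = random.Random(0)
x = [rng.uniform(-1, 1) for _ in range(8)]

shallow = make_shallow(n_inputs=8, n_units=4, rng=rng)
tree = make_tree(n_inputs=8, n_units=4, rng=rng)

print("shallow params:", count_params(shallow))          # 8*4 + 4 + 4 = 40
print("tree nodes:", sum(len(level) for level in tree))  # 4 + 2 + 1 = 7
print("tree params:", count_tree_params(tree))           # 7 * (2*4 + 4 + 4) = 112
print("outputs:", eval_shallow(shallow, x), eval_tree(tree, x))
```

In this sketch each tree node is a function of only two variables, whatever the total input dimension. That is one kind of structure a hierarchical network can mirror and a single hidden layer over all inputs cannot.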

Cite this study

Mhaskar and Poggio (2016) studied this question.

synapsesocial.com/papers/6a0f925e5725bbd5cc5fddd3 — DOI: https://doi.org/10.1142/s0219530516400042

Authors

H. N. Mhaskar

Claremont Graduate University

Tomaso Poggio

Brigham and Women's Hospital

Journals

Analysis and Applications

Institutions

Massachusetts Institute of Technology

California Institute of Technology

McGovern Institute for Brain Research
