This paper explores the complexity of deep feedforward networks with linear pre-synaptic couplings and rectified linear activations. It contributes to the growing body of work contrasting the representational power of deep and shallow network architectures. In particular, we offer a framework, based on computational geometry, for comparing deep and shallow models that belong to the family of piecewise linear functions. We look at a deep rectifier multi-layer perceptron (MLP) with linear output units and compare it with a single-layer version of the model. In the asymptotic regime, when the number of inputs stays constant, if the shallow model has $kn$ hidden units and $n_0$ inputs, then the number of linear regions is $O(k^{n_0} n^{n_0})$. For a $k$-layer model with $n$ hidden units on each layer it is $\Omega(\lfloor n/n_0 \rfloor^{k-1} n^{n_0})$. The number $\lfloor n/n_0 \rfloor^{k-1}$ grows faster than $k^{n_0}$ when $n$ tends to infinity, or when $k$ tends to infinity and $n \geq 2n_0$. Additionally, even when $k$ is small, if we restrict $n$ to be $2n_0$, we can show that a deep model has considerably more linear regions than a shallow one. We consider this a first step towards understanding the complexity of these models, and specifically towards providing suitable mathematical tools for future analysis.
Pascanu et al. studied this question.
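As a rough illustration of the two growth rates quoted in the abstract, the sketch below evaluates only their leading terms, $k^{n_0} n^{n_0}$ for the shallow model and $\lfloor n/n_0 \rfloor^{k-1} n^{n_0}$ for the deep one. The constants hidden by $O(\cdot)$ and $\Omega(\cdot)$ are dropped, so the printed values are not region counts and only indicate how the two expressions scale. The function names `shallow_growth` and `deep_growth` and the chosen parameter values are assumptions made for this example.

```python
def shallow_growth(n: int, k: int, n0: int) -> int:
    """Leading term k^n0 * n^n0 of the shallow upper bound.

    Assumption: the constant hidden by O(.) is ignored.
    """
    return (k ** n0) * (n ** n0)


def deep_growth(n: int, k: int, n0: int) -> int:
    """Leading term floor(n/n0)^(k-1) * n^n0 of the deep lower bound.

    Assumption: the constant hidden by Omega(.) is ignored.
    """
    return ((n // n0) ** (k - 1)) * (n ** n0)


if __name__ == "__main__":
    n0 = 2          # number of inputs (illustrative value)
    n = 2 * n0      # layer width fixed at n = 2 * n0, as in the abstract
    print("k  shallow_term  deep_term")
    for k in (2, 4, 8, 16):
        print(k, shallow_growth(n, k, n0), deep_growth(n, k, n0))
```

With the width held at $n = 2n_0$, the deep term carries a factor $2^{k-1}$ while the shallow term carries $k^{n_0}$, so the deep expression is exponential in the depth $k$ and the shallow one only polynomial in $k$.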