In recent years, deep neural networks have achieved remarkable success across a wide range of tasks, from image recognition and language processing to scientific discovery. Despite their empirical performance, the theoretical foundations of deep learning remain poorly understood. This thesis investigates the mathematical structure, capabilities, and limitations of ReLU neural networks, focusing on two central aspects: expressivity and verification. Leveraging the fact that ReLU networks compute continuous piecewise linear (CPWL) functions, we apply tools from polyhedral geometry, combinatorics, and computational complexity. We first investigate expressivity questions: How deep or large must a network be to represent a given function? We prove lower bounds on the depth required to compute the max function with ReLU networks whose nonlinearities lie on the braid arrangement. We also show that, for convex CPWL functions, one can smoothly trade off depth and size when designing networks. We then extend this interpolation to certain nonconvex functions efficiently decomposable into differences of convex functions. Next, we develop a polyhedral framework for such decompositions of CPWL functions. While these decompositions always exist, little is known about their complexity. We introduce the decomposition polyhedron, which captures all decompositions compatible with a given polyhedral complex, and show that minimal decompositions must be among the vertices. This approach yields a finite method for finding optimal decompositions with the compatibility restriction and reveals structural uniqueness in specific cases. We then turn to classification tasks and study the topological expressivity of ReLU networks via the Betti numbers of their decision regions. We prove lower bounds and upper bounds depending on the size of the network, demonstrating a sharp separation between shallow and deep architectures in terms of their ability to model topologically complex data. Finally, we address the computational complexity of verifying structural properties. We prove that deciding injectivity of a ReLU network is coNP-complete, and we give an algorithm that establishes fixed-parameter tractability of the problem with respect to the input dimension. We also establish hardness results for simple verification tasks and initiate the complexity-theoretic study of surjectivity, linking it to problems in computational convexity. Altogether, this thesis develops new techniques to understand the capabilities and limitations of ReLU networks by gaining insights into the structure of the functions they can represent and the inherent difficulty of verifying their properties.
Moritz Grillo (Thu,) studied this question.