The empirical success of modern machine learning continues to outpace theoretical understanding. In particular, the principles governing its stability, generalization, and trainability remain incompletely understood. Nevertheless, machine learning has found applications across the natural sciences and has driven progress in many of them. Conversely, machine learning benefits from concepts originating in theoretical and statistical physics. This thesis develops a physics-inspired framework to investigate how information propagates and transforms in deep neural networks. Drawing an analogy between neural networks and Ising models, we analyze their phase structures and demonstrate that the network configuration for optimal trainability, which emerges near a critical point, can be derived by maximizing entropy-based information flow. Motivated by the role of negatively curved geometries in facilitating correlation spreading, we examine hyperbolic Ising models on disc topologies. To this end, we develop robust methods for constructing hyperbolic tessellations and their neighborhood relations and explore connections to JT gravity and discrete holography. Building on Ising-model-inspired, entropy-based tools, we classify neural networks as critical, subcritical, or supercritical based on the eigenvalue spectra of Gram matrices, which can be interpreted as analogues of density matrices. At the heart of the entropy measurement is a reconstruction mechanism that enables comparisons between network states. We validate this framework by predicting optimal trainability for multilayer perceptrons, convolutional and residual networks, autoencoders, and generative adversarial networks. Moreover, we show that optimizing information flow further improves final training accuracy.
On simulated LHC data, multilayer perceptrons achieve accuracy improvements of roughly ten percentage points, while convolutional architectures exhibit architecture-dependent behavior driven by differences in spatial scale. The reconstruction mechanism also offers a visualization of how input information evolves into network predictions and thus touches on the field of explainable artificial intelligence. Finally, we apply neural networks to predict spectral density functions from imaginary Green’s functions. While neural networks match or slightly surpass established methods on synthetic data, they remain inferior on real physical data, highlighting limitations arising from discrepancies between training distributions and experimental measurements.
Yanick Thurn studied this question.