What question did this study set out to answer?

This research aims to understand the generalization capabilities of neural networks using a quantum information framework.

April 18, 2026 | Open Access

Quantum Information Framework for Neural Network Generalization: A Comprehensive Experimental Analysis

Key Points

Introduced a quantum information framework to analyze neural network representations.
Treated hidden layer activations as quantum states characterized by quantum metrics.
Conducted experiments on synthetic classification and algorithmic reasoning tasks spanning over 3000 cumulative training epochs and 5000 Wang-Landau sampling iterations.
Analyzed von Neumann entropy and purity during training phases.
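Trained networks reached 99.5% test accuracy as von Neumann entropy fell from approximately 1.42 to 1.19 and purity rose from approximately 0.29 to 0.37.
On modular arithmetic tasks that exhibit grokking, von Neumann entropy rose from 1.55 to 3.29 during a 2000-epoch memorization phase, before any improvement in generalization.
Correlation analysis revealed a strong anticorrelation between von Neumann entropy and purity (correlation coefficient of -0.99).
Effective rank of the density matrix stabilized near 2.7 to 3.0 for the 6-neuron classifier.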
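
The metrics named in the key points are not defined in this summary. As a reference point, the standard quantum-information definitions are sketched below. How the paper builds the density matrix $\rho$ from hidden activations is not stated here, so $\rho$ is taken to be any positive semidefinite matrix with unit trace, and the symbols $\lambda_i$, $d$, and $\gamma$ are notation chosen for this sketch.

```latex
\begin{align}
  \rho &\succeq 0, \qquad \operatorname{Tr}\rho = 1,
       \qquad \lambda_1, \dots, \lambda_d \text{ the eigenvalues of } \rho, \\
  S(\rho) &= -\operatorname{Tr}(\rho \ln \rho)
           = -\sum_{i=1}^{d} \lambda_i \ln \lambda_i,
       \qquad 0 \le S(\rho) \le \ln d, \\
  \gamma(\rho) &= \operatorname{Tr}(\rho^{2})
                = \sum_{i=1}^{d} \lambda_i^{2},
       \qquad \tfrac{1}{d} \le \gamma(\rho) \le 1 .
\end{align}
```

A pure state has $S = 0$ and $\gamma = 1$, while the maximally mixed state $\rho = I/d$ has $S = \ln d$ and $\gamma = 1/d$. Both quantities depend only on the eigenvalue spectrum and move in opposite directions as the spectrum spreads out, so a strong negative correlation between them is plausible. Effective rank has several conventions (for example $1/\gamma$ or $e^{S}$); which one the paper uses is not stated in this summary.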

Abstract

The remarkable generalization capabilities of overparameterized neural networks remain one of the most profound mysteries in modern machine learning. Despite having sufficient capacity to simply memorize training data, these networks consistently achieve superior performance on unseen test examples. Traditional analyses focusing on weight distributions, loss landscapes, and optimization dynamics have provided partial insights but fail to capture the complex correlational structure that emerges between neurons during learning. This paper introduces a comprehensive quantum information framework for analyzing neural network representations, treating hidden layer activations as quantum states and characterizing them through density matrices, von Neumann entropy, purity, and related quantum information metrics. Through systematic experiments on synthetic classification tasks and algorithmic reasoning problems spanning over 3000 cumulative training epochs and 5000 Wang-Landau sampling iterations, we demonstrate that these quantum metrics reveal structure completely invisible to classical analysis. Our results show that stochastic gradient descent systematically reduces von Neumann entropy while increasing purity as networks learn, transitioning from highly mixed quantum states with entropy approximately 1.42 and purity approximately 0.29 to purer configurations with entropy approximately 1.19 and purity approximately 0.37 that achieve superior test accuracy of 99.5%. Using Wang-Landau sampling to explore the equilibrium entropy landscape, we discover that neural networks naturally favor high-entropy states with entropy approximately 1.42 and moderate accuracy of 62.0%, and that optimization actively purifies the quantum state to achieve task performance. 
On modular arithmetic tasks exhibiting the grokking phenomenon, we observe that von Neumann entropy increases dramatically from 1.55 to 3.29, representing a 112% increase, during an extended 2000-epoch memorization phase before any improvement in generalization, while purity decreases correspondingly from 0.444 to 0.042, a 90.5% decrease, approaching the maximally mixed limit of 0.01. This entropy accumulation phase appears necessary for eventual generalization, with the network building quantum correlations at a rate of approximately 0.00097 nats per epoch during the plateau. Weight entropy, in contrast, increases by only 11% over the same period. Correlation analysis reveals that von Neumann entropy and purity are almost perfectly anticorrelated, with a correlation coefficient of -0.99, while weight entropy correlates only weakly with the quantum metrics, with absolute correlation coefficients below 0.3. The effective rank of the density matrix stabilizes near 2.7 to 3.0 for the 6-neuron classifier, indicating that the network effectively uses approximately three independent activation modes. These findings establish quantum information metrics as powerful tools for understanding neural network learning dynamics, provide the first quantitative evidence for the entropy accumulation interpretation of grokking, and suggest that neural networks undergo a quantum-inspired phase transition during learning. We discuss implications for generalization theory, optimization algorithm design, and the statistical physics of deep learning, and outline a comprehensive research agenda for extending these methods to larger architectures and more complex tasks.
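
To make the quantities in the abstract concrete, here is a minimal pure-Python sketch of how von Neumann entropy, purity, and an effective rank could be computed from a batch of hidden activations. The abstract does not say how the density matrix is constructed, so the normalized Gram matrix used here is an assumption, as are the function names, the Jacobi eigenvalue routine, and both effective-rank conventions.

```python
import math


def density_matrix(activations):
    """ASSUMED construction: rho = A^T A / Tr(A^T A), with one row of A per sample."""
    n = len(activations[0])
    gram = [[sum(row[i] * row[j] for row in activations) for j in range(n)]
            for i in range(n)]
    trace = sum(gram[i][i] for i in range(n))
    return [[g / trace for g in row] for row in gram]


def symmetric_eigenvalues(matrix, sweeps=100, tol=1e-24):
    """Eigenvalues of a real symmetric matrix via cyclic Jacobi rotations."""
    n = len(matrix)
    a = [list(row) for row in matrix]  # work on a copy
    for _ in range(sweeps):
        # Stop once the squared off-diagonal mass is negligible.
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle that zeroes a[p][q]; kept within [-pi/4, pi/4].
                diff = a[q][q] - a[p][p]
                theta = math.pi / 4 if diff == 0 else 0.5 * math.atan(2 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # columns p and q of A J
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # rows p and q of J^T (A J)
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return [a[i][i] for i in range(n)]


def quantum_metrics(rho):
    """Entropy (in nats), purity, and two common effective-rank conventions."""
    eigs = [max(lam, 0.0) for lam in symmetric_eigenvalues(rho)]  # clip round-off
    entropy = -sum(lam * math.log(lam) for lam in eigs if lam > 1e-15)
    purity = sum(lam * lam for lam in eigs)
    return {
        "entropy": entropy,
        "purity": purity,
        "rank_participation": 1.0 / purity,    # assumed convention: 1 / Tr(rho^2)
        "rank_exp_entropy": math.exp(entropy),  # assumed convention: exp(S)
    }


# Hypothetical usage: three orthogonal, equal-norm activation vectors.
demo = quantum_metrics(density_matrix([[1, 0, 0], [0, 1, 0], [0, 0, 1]]))
print(demo)
```

For three mutually orthogonal, equal-norm activation vectors this construction gives the maximally mixed state, with purity 1/3 and entropy ln 3 ≈ 1.099 nats. By the same standard bound, the maximally mixed purity limit of 0.01 quoted for the grokking experiment corresponds to a 100-dimensional density matrix.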

Cite This Study

Chi Hin Lam studied this question.

synapsesocial.com/papers/69e3215140886becb6540951 https://doi.org/10.5281/zenodo.19617039
