What question did this study set out to answer?

The study set out to establish a new approximation theorem for transformer architectures on low-dimensional manifolds, in which the approximation error is governed by the intrinsic dimension d rather than the ambient dimension D.

May 7, 2026 | Open Access

Dimension-Independent Approximations on Low-Dimensional Manifolds Using Transformers

Key Points

Aimed to establish a new approximation theorem for transformer architectures on low-dimensional manifolds.
Developed a non-asymptotic, uniform approximation theorem for a class of single-head ReLU-transformers acting on vector inputs.
Conducted a numerical experiment using a circle embedded in ambient dimensions of various sizes to test the prediction.
Proved that the approximation error depends only on the intrinsic dimension d, not on the ambient dimension D.
Observed that the approximation error remained nearly unchanged as the ambient dimension D varied, consistent with the predicted ambient-dimension independence.

Abstract

Deep neural networks have been remarkably successful in high-dimensional learning and scientific computing, often succeeding where classical discretization methods fail due to the curse of dimensionality. This efficacy is often explained by their approximation properties combined with the manifold hypothesis: the idea that although data are embedded in dimension D, the effective degrees of freedom are governed by a much smaller intrinsic dimension d≪D. Under this hypothesis, data are concentrated near a low-dimensional manifold that neural networks can approximate efficiently. While the approximation theory for fully-connected ReLU networks on manifolds is well established, a comparable theory for transformer architectures, the dominant model class in modern foundation models, is still emerging. In this paper, we prove a new non-asymptotic, uniform approximation theorem for a class of single-head ReLU-transformers acting on vector inputs, where the approximation error depends only on the intrinsic dimension d rather than on the ambient dimension D. To the best of our knowledge, this is the first transformer approximation result that combines an intrinsic-dimensional rate with an ambient-dimension-independent multiplicative constant. We include a numerical experiment using a circle embedded in ambient dimensions of various sizes, showing that the observed error remains nearly unchanged as D varies, in agreement with the predicted ambient-dimension independence.
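The experimental setup described in the abstract, a circle (intrinsic dimension d = 1) placed in ambient spaces of different dimension D, can be illustrated with a minimal sketch. The abstract does not state how the circle is embedded, so the random orthonormal frame used here is an assumption, and the function name `embed_circle` and the chosen values of D are hypothetical. The sketch only constructs the data and checks that its geometry is the same for every D; it does not train or evaluate a transformer.

```python
import numpy as np


def embed_circle(n_points, ambient_dim, seed=0):
    """Sample the unit circle and embed it isometrically in R^ambient_dim.

    Assumption: the embedding is a random linear isometry (a D x 2 matrix
    with orthonormal columns). The paper's actual embedding may differ.
    """
    if ambient_dim < 2:
        raise ValueError("ambient_dim must be at least 2")
    rng = np.random.default_rng(seed)

    # Evenly spaced angles parametrize the 1-dimensional manifold.
    theta = np.linspace(0.0, 2.0 * np.pi, n_points, endpoint=False)
    circle_2d = np.stack([np.cos(theta), np.sin(theta)], axis=1)  # (n, 2)

    # Reduced QR of a Gaussian matrix gives a D x 2 orthonormal frame.
    frame, _ = np.linalg.qr(rng.standard_normal((ambient_dim, 2)))

    # Each row is a point of the circle expressed in R^D.
    return theta, circle_2d @ frame.T  # (n, D)


if __name__ == "__main__":
    for D in (2, 16, 256):  # hypothetical ambient dimensions
        theta, points = embed_circle(128, D)
        norms = np.linalg.norm(points, axis=1)
        chord = np.linalg.norm(points[0] - points[32])
        print(f"D={D:4d}  shape={points.shape}  "
              f"max |norm - 1|={np.abs(norms - 1.0).max():.2e}  "
              f"chord length={chord:.6f}")
```

Because the embedding is an isometry, point norms and chord lengths agree across all values of D up to floating-point error. Only the number of input coordinates changes, which is the setting in which an ambient-dimension-independent error bound can be tested.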
