Learning representations on large graphs is a fundamental challenge due to complex inter-dependencies among nodes. While Transformers excel on small graphs via global attention, existing architectures often mirror large language models by stacking deep attention layers. This design philosophy restricts the scalability of Transformers on large graphs, because the inter-dependent structure of a graph makes it non-trivial to partition losslessly across modern accelerators. We provide a theoretical reassessment of whether deep attention is necessary. Our analysis shows that, for a generic hybrid propagation layer that combines global attention and graph-based propagation, multi-layer models can be reduced to one-layer counterparts without sacrificing representation capacity. Guided by these insights, we propose the Simplified Single-Layer Graph Transformer (SGFormer), which uses single-layer global attention with approximation-free linear complexity. Unlike scalable Transformers that rely on stochastic approximations or restricted receptive fields, SGFormer scales exactly linearly with respect to graph size and requires no approximation to accommodate all-pair interactions. Empirically, it yields orders-of-magnitude inference acceleration over state-of-the-art Transformers on medium-sized graphs and scales smoothly to the web-scale ogbn-papers100M dataset (0.1B nodes) on a single GPU with 24GB memory. Our results suggest that principled simplification is a highly effective path toward powerful, scalable foundation models for large-graph learning.
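To make the linear-complexity claim concrete, the following is a minimal sketch, not the authors' implementation. It assumes a softmax-free attention of the form (QK^T)V, which the abstract does not spell out, and it omits normalization, learned projections, and the graph-based propagation branch. Under that assumption, matrix associativity lets the same all-pair result be computed as Q(K^T V), which stores only a d x d summary instead of an N x N score matrix. The helper names (`matmul`, `quadratic_attention`, `linear_attention`) are hypothetical, and plain Python lists are used only to keep the example dependency-free.

```python
import random


def matmul(a, b):
    """Plain-Python matrix product of a (n x k) and b (k x m)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def quadratic_attention(q, k, v):
    """Compute (Q K^T) V, materialising the N x N all-pair score matrix."""
    scores = matmul(q, transpose(k))  # N x N: cost grows quadratically in N
    return matmul(scores, v)          # N x d


def linear_attention(q, k, v):
    """Compute Q (K^T V): the same product, storing only a d x d summary."""
    summary = matmul(transpose(k), v)  # d x d: cost grows linearly in N
    return matmul(q, summary)          # N x d


# Illustrative sizes only (assumption): N nodes with d-dimensional features.
random.seed(0)
n, d = 6, 3
q = [[random.random() for _ in range(d)] for _ in range(n)]
k = [[random.random() for _ in range(d)] for _ in range(n)]
v = [[random.random() for _ in range(d)] for _ in range(n)]

out_quadratic = quadratic_attention(q, k, v)
out_linear = linear_attention(q, k, v)

# Largest element-wise difference between the two orderings.
max_gap = max(
    abs(x - y)
    for row_a, row_b in zip(out_quadratic, out_linear)
    for x, y in zip(row_a, row_b)
)
print(max_gap)
```

The two orderings agree up to floating-point rounding, so in this simplified setting the linear-cost form involves no sampling or truncation of the all-pair interactions. With a softmax over the scores the reordering would no longer be exact, which is why the sketch assumes it away.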
Qitian Wu
Broad Institute
Kai Yang
Shanghai Jiao Tong University
Hengrui Zhang
University of Illinois Chicago
IEEE Transactions on Pattern Analysis and Machine Intelligence
Broad Institute
University of Hong Kong
University of Illinois Chicago
Wu et al. studied this question.
synapsesocial.com/papers/6a1a7ecb0307b78509431508 — DOI: https://doi.org/10.1109/tpami.2026.3697944