Transformer self-attention and billion-node network analyses share a key limitation: all-to-all evaluation incurs an O(N²) computational cost. Existing methods address this either by distributing the workload across hardware or by substituting recurrent operators, a substitution that trades associative recall for efficiency. We present Reduced Interaction Sampling (RIS), a stochastic sparsification framework that computes only a fraction of the possible pairwise interactions. By leveraging topological redundancy in real-world networks, RIS decouples structural accuracy from computational expense. For example, on the com-LiveJournal graph with 4 million nodes, RIS preserves the degree centrality rank (ρ = 0.96) while using only 10% of the edges. A partition-based variant, RIS-Structural, identifies twice as many hubs as sliding-window methods under heavy sparsity (1.00% vs 0.50%, p = 0.033). In TinyLlama-1.1B attention tests (0.5k to 65k tokens), RIS achieves a geometric reach of about 21k tokens at 65k, outperforming Longformer (≈2k) and BigBird (≈17k). Window-based models surpass 10⁵ Cumulative Attention Mass but lose 98% of hub recovery, which shows that dense scalar weights poorly reflect long-range geometric reach. RIS maintains stable Hub Recall on sequences up to 128 times longer with an edge budget below 0.01%. Stochastic sampling provides a mathematically robust way to scale context architectures without structural collapse.
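The degree-centrality claim can be illustrated with a minimal sketch. It is not the RIS algorithm: it assumes plain uniform random edge sampling at a 10% budget on a small synthetic heavy-tailed graph, and measures how well the degree ranking survives using Spearman's ρ. The function names (`make_heavy_tailed_edges`, `sample_edges`, `spearman_rho`) and all parameter values are hypothetical, chosen only for illustration.

```python
import numpy as np


def make_heavy_tailed_edges(n_nodes, n_edges, rng):
    """Synthetic edge list with a heavy-tailed degree profile (assumption:
    endpoints drawn with probability proportional to (i + 1) ** -0.5).
    Self-loops and repeated edges are allowed for simplicity."""
    weights = (np.arange(n_nodes) + 1.0) ** -0.5
    weights /= weights.sum()
    endpoints = rng.choice(n_nodes, size=2 * n_edges, p=weights)
    return endpoints.reshape(n_edges, 2)


def sample_edges(edges, fraction, rng):
    """Keep a uniformly random subset of edges (without replacement)."""
    k = int(round(fraction * len(edges)))
    keep = rng.choice(len(edges), size=k, replace=False)
    return edges[keep]


def degrees(edges, n_nodes):
    """Degree of every node: count how often it appears as an endpoint."""
    return np.bincount(edges.ravel(), minlength=n_nodes)


def average_ranks(x):
    """1-based ranks, with tied values sharing the mean of their ranks."""
    _, inverse, counts = np.unique(x, return_inverse=True, return_counts=True)
    last = np.cumsum(counts)       # highest rank inside each tie group
    first = last - counts + 1      # lowest rank inside each tie group
    return ((first + last) / 2.0)[inverse]


def spearman_rho(a, b):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return float(np.corrcoef(average_ranks(a), average_ranks(b))[0, 1])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_nodes, n_edges = 2000, 40000
    edges = make_heavy_tailed_edges(n_nodes, n_edges, rng)
    sparse = sample_edges(edges, 0.10, rng)
    rho = spearman_rho(degrees(edges, n_nodes), degrees(sparse, n_nodes))
    print(f"kept {len(sparse)} of {len(edges)} edges, Spearman rho = {rho:.3f}")
```

The correlation obtained on a toy graph like this depends heavily on graph size and mean degree, so it should not be compared with the ρ = 0.96 reported for com-LiveJournal; the sketch only shows the kind of measurement involved.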
Anderson Santos (Sat,) studied this question.