What question did this study set out to answer?

This research aims to create a language modeling architecture that scales linearly with sequence length, avoiding the O(N²) compute and memory cost of Transformer self-attention.

March 21, 2026 · Open Access

LaminarNet: Linear-Time Multi-Scale State Propagation for Efficient Language Modeling

Key Points

Developed the LaminarNet architecture, which replaces quadratic self-attention with two linear-time mechanisms.
Implemented Geometric Drift Field (GDF) for selective state propagation.
Utilized Cross-Stratum Routing (CSR) for hierarchical token interaction.
In a parameter-matched benchmark (49M parameters), reduced perplexity by 38.3% compared to Transformers.
Increased throughput by 63.3% in the same benchmark.
Reduced VRAM usage by 38.8% in the same benchmark.
Trained a 437M-parameter model on 9.56B Turkish tokens to test scaling.
Reached a perplexity of 11.38 with that model, demonstrating strong scaling behavior.

Abstract

Transformer architectures dominate modern language modeling but incur O(N²) computational and memory costs with respect to sequence length, limiting scalability. We introduce LaminarNet, a linear-time architecture that replaces quadratic self-attention with two key mechanisms: Geometric Drift Field (GDF) for selective state propagation and Cross-Stratum Routing (CSR) for multi-scale hierarchical token interaction. LaminarNet achieves significant improvements in efficiency and performance. In a parameter-matched benchmark (49M parameters), it reduces perplexity by 38.3%, increases throughput by 63.3%, and reduces VRAM usage by 38.8% compared to Transformers. Additionally, a 437M-parameter model trained on 9.56B Turkish tokens demonstrates strong scaling behavior, achieving a perplexity of 11.38. Code: https://github.com/Uunan/LaminarNet Model: https://huggingface.co/Uunan/LaminarNet₄37MTurkishBase
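To make the O(N²) versus linear-time contrast in the abstract concrete, here is a minimal sketch. It is not the paper's GDF or CSR: `gated_scan` is a generic gated recurrence chosen only for illustration, and the function names and the gate value of 0.5 are assumptions. It shows that full self-attention scores every token pair, while a recurrent state update touches each token once and keeps a fixed-size state.

```python
def attention_pairwise_ops(n: int) -> int:
    """Count query-key score computations in full self-attention: N * N."""
    return n * n


def scan_ops(n: int) -> int:
    """Count state updates in a single left-to-right scan: N."""
    return n


def gated_scan(inputs, gates):
    """Generic gated linear recurrence (illustrative, not the paper's GDF).

    h_t = g_t * h_{t-1} + (1 - g_t) * x_t
    The state is a single number, so memory does not grow with sequence length.
    """
    state = 0.0
    outputs = []
    for x, g in zip(inputs, gates):
        state = g * state + (1.0 - g) * x
        outputs.append(state)
    return outputs


if __name__ == "__main__":
    # Compare how the two operation counts grow as the sequence gets longer.
    for n in (512, 1024, 2048):
        print(n, attention_pairwise_ops(n), scan_ops(n))
    # Run the scan on a tiny sequence with a constant gate of 0.5.
    print(gated_scan([1.0, 1.0, 1.0], [0.5, 0.5, 0.5]))
```

Doubling the sequence length quadruples the attention count but only doubles the scan count, which is the scaling gap the abstract refers to.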


Cite This Study

Ugurhan Colak studied this question.

synapsesocial.com/papers/69be38da6e48c4981c6798c7 https://doi.org/10.5281/zenodo.19098613
