Hierarchical Gated Delta Memory (HGDM) is an attention-free recurrent architecture for byte-level language modeling with constant inference-time memory. HGDM maintains a fixed-size recurrent matrix state that is updated through gated outer-product dynamics and multi-timescale decay mechanisms inspired by predictive coding and sparse representations. The architecture eliminates self-attention and external key-value caches while supporting long-context recurrent inference with fixed memory usage. This release contains the accompanying preprint, which describes the HGDM architecture, training methodology, Nitro Triton scan kernel, and experimental evaluations on byte-level language modeling and long-context retrieval tasks. Code repository: https://github.com/iam-saiteja/HGDM-Hierarchical-Gated-Delta-Memory
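To make the idea of a fixed-size matrix state with gated outer-product updates and multi-timescale decay more concrete, here is a minimal NumPy sketch. It is not taken from the preprint or the repository: it uses a generic delta-rule update, and the function names (`gated_delta_step`, `run_bytes`), the random byte embeddings, the constant gate `beta`, and the decay values in `alphas` are all assumptions chosen for illustration. In a trained model the gates would presumably be computed from the input rather than fixed.

```python
import numpy as np


def gated_delta_step(S, k, v, alpha, beta):
    """One recurrent step on a bank of matrix states (illustrative, not HGDM's exact rule).

    S     : (n_scales, d_v, d_k) fixed-size states, one per timescale
    k     : (d_k,) key, assumed L2-normalized
    v     : (d_v,) value
    alpha : (n_scales,) per-timescale decay factors in (0, 1]
    beta  : scalar write gate in [0, 1]
    """
    pred = S @ k                                  # (n_scales, d_v): current recall for key k
    err = v - pred                                # prediction error per timescale
    write = err[:, :, None] * k[None, None, :]    # outer product err k^T for each timescale
    return alpha[:, None, None] * S + beta * write


def run_bytes(data, d_k=8, d_v=8, alphas=(0.5, 0.9, 0.99), beta=0.5, seed=0):
    """Fold a byte string into the state; memory does not grow with len(data)."""
    rng = np.random.default_rng(seed)
    # Hypothetical random embeddings, one key and one value per byte value.
    keys = rng.standard_normal((256, d_k))
    keys /= np.linalg.norm(keys, axis=1, keepdims=True)
    vals = rng.standard_normal((256, d_v))

    alpha = np.asarray(alphas, dtype=float)
    S = np.zeros((len(alphas), d_v, d_k))
    for b in data:                                # iterating over bytes yields ints 0..255
        S = gated_delta_step(S, keys[b], vals[b], alpha, beta)
    return S


short_state = run_bytes(b"hello")
long_state = run_bytes(b"hello" * 1000)
print(short_state.shape, long_state.shape)        # same shape regardless of input length
```

The state has shape `(n_scales, d_v, d_k)` whether the input is five bytes or five thousand, which is the property that replaces a growing key-value cache. Smaller decay factors forget quickly and larger ones retain information over longer spans, which is one simple way to realize multiple timescales.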
Thanniru Sai Teja (Tue,) studied this question.