This deposit contains the official PyTorch research scaffold, unit test suites, and formal mathematical proofs for Anamnesis, a resource-rational, budgeted long-context memory architecture combining Retentive Networks (RetNet), Block Attention Residuals (AttnRes), and Hashed N-gram Engram memory. Long-context language models must balance the low marginal cost of sequence streaming against high-fidelity exact recall of decision-critical states. This repository presents a complete proof-aligned implementation designed to explore budgeted memory boundaries without relying on a full Transformer Key-Value (KV) cache.

Core Architectural Components

1. Time Axis (RetNet Streaming + Bounded Snapshots): Default sequence streaming is handled by contractive recurrent states, augmented with a capped snapshot cache (K_) and an output-side vocabulary-logit projection readout to counter recurrent decay and superposition noise.
2. Depth Axis (Zero-Parameter Block Attention Residuals): Replaces traditional, heavily parameterized cross-layer projections with pure softmax attention over preceding raw block states, using a single learned pseudo-query parameter wₗ ∈ ℝᵈ and an age-based distance penalty.
3. Local Pattern Axis (Hashed N-gram Engram): High-entropy, deterministic local pattern memory built with multi-head N-gram hash tables, isotropic scalar gating, and a dilated causal 1D convolution layer to expand the local receptive field.
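The depth-axis mixing in item 2 can be illustrated with a minimal, framework-free sketch. It is not the repository's implementation: the function name `block_attn_residual`, the linear form of the age penalty, the `age_penalty` value, and the 1/sqrt(d) scaling are all assumptions made for illustration. It shows only the core idea, a softmax over preceding raw block states scored by a single pseudo-query vector.

```python
import math

def block_attn_residual(block_states, w, age_penalty=0.1):
    """Mix preceding block states using one pseudo-query vector.

    block_states: list of d-dimensional vectors, oldest block first.
    w:            d-dimensional pseudo-query for the current block.
    age_penalty:  hypothetical linear penalty per block of distance
                  (the exact penalty form is an assumption).
    Returns (mixed_state, attention_weights).
    """
    d = len(w)
    n = len(block_states)
    logits = []
    for i, h in enumerate(block_states):
        # Dot-product score between the pseudo-query and the raw state.
        score = sum(wi * hi for wi, hi in zip(w, h)) / math.sqrt(d)
        age = n - 1 - i  # 0 for the most recent block
        logits.append(score - age_penalty * age)

    # Numerically stable softmax over the preceding blocks.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    weights = [e / z for e in exps]

    # Weighted sum of raw block states; no projection matrices involved.
    mixed = [
        sum(weights[i] * block_states[i][j] for i in range(n))
        for j in range(d)
    ]
    return mixed, weights


if __name__ == "__main__":
    states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    mixed, weights = block_attn_residual(states, w=[0.5, -0.5])
    print(weights, mixed)
```

Under these assumptions, the only learned quantity per block is the vector `w`; a larger `age_penalty` shifts weight toward more recent blocks.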
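The multi-head N-gram hashing in item 3 can likewise be sketched in plain Python. This is a hedged illustration rather than the repository's code: `ngram_hash_indices`, the polynomial hash, the per-head multipliers, the modulus, and the `pad_id` convention are all hypothetical choices. The sketch covers only the deterministic, causal mapping from token N-grams to table slots; the scalar gating and dilated convolution are omitted.

```python
def ngram_hash_indices(token_ids, n=2, num_heads=2, table_size=1024, pad_id=0):
    """Map each position's trailing N-gram to one table slot per head.

    token_ids:  list of integer token ids.
    n:          N-gram length (the gram ends at the current position).
    num_heads:  number of independent hash functions / tables.
    table_size: number of slots in each table.
    pad_id:     hypothetical id used where history is shorter than n.
    Returns a list (one entry per position) of per-head slot indices.
    """
    modulus = (1 << 61) - 1  # large Mersenne prime, an illustrative choice
    # Distinct odd multiplier per head so heads collide on different grams.
    multipliers = [1_000_003 + 2 * h for h in range(num_heads)]

    # Left-pad so every position has a full N-gram of past tokens only.
    padded = [pad_id] * (n - 1) + list(token_ids)

    indices = []
    for t in range(len(token_ids)):
        gram = padded[t : t + n]  # tokens t-n+1 .. t (causal window)
        per_head = []
        for mult in multipliers:
            h = 0
            for tok in gram:
                # Polynomial rolling hash over the token ids in the gram.
                h = (h * mult + tok + 1) % modulus
            per_head.append(h % table_size)
        indices.append(per_head)
    return indices


if __name__ == "__main__":
    print(ngram_hash_indices([5, 7, 5, 7]))
```

Because the hash depends only on the current and preceding tokens, the same N-gram always resolves to the same slots, and changing a later token leaves earlier positions unaffected. In a full model, each slot index would select an embedding row from that head's table.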
Xingyu Xie (Wed,) studied this question.