What question did this study set out to answer?

The study aims to develop a memory architecture that improves retrieval efficiency for long-context language models.

June 28, 2026 | Open Access

Voxel-Addressable Memory: A Unified Spatial Index for Sovereign AI

What does this research mean for the field?

A two-tier rolling memory architecture combining a coarse-to-fine voxel hierarchy with a navigable small-world graph (HNSW) enables long-context language models to scale to a million-item global store at constant attention cost, achieving a 285-fold reduction in scan cost with full recall retention. Novelty: methodological. Consensus alignment: neutral.

Key Points

Utilized a voxel-based memory architecture with a coarse-to-fine voxel hierarchy over the TERA cube.
Implemented a learned sub-voxel residual indexed by a small-world graph (HNSW).
Tested performance on an Apple M1 Max (32 GB) using bge-large embeddings, PCA-whitening, and Hopfield consolidation.
Achieved a 64% relative gain in retrieval recall@10 from composable levers.
Demonstrated full recall retention with a 285× scan-cost reduction on a 1M-item HNSW store.
Observed negative results for off-the-shelf cross-encoder rerankers.
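
To give the 285× figure a sense of scale, the short calculation below converts it into a per-query workload. It assumes that "scan cost" counts the items compared against the query, with a brute-force pass over the whole store as the baseline; the source does not define the metric, so this is a reading rather than the paper's definition. The function name `items_scanned` is hypothetical.

```python
def items_scanned(store_size: int, reduction_factor: float) -> float:
    """Items compared per query, assuming the baseline scans the whole store."""
    return store_size / reduction_factor


# Figures quoted in the key points: 1M-item store, 285x reduction.
per_query = items_scanned(1_000_000, 285)

print(round(per_query))                 # 3509
print(f"{per_query / 1_000_000:.2%}")   # 0.35%
```

Under that assumption, each query touches roughly 3,500 items, or about 0.35% of the store.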

Abstract

We present a memory architecture for long-context language models built on one idea borrowed from voxel game engines: partition a continuous space into cells, store only the occupied cells, and look them up without scanning. The AAMT routing stack already encodes such a partition (the 4-bit Meji and 8-bit Odu lattices); we formalize it as a strict coarse-to-fine voxel hierarchy over the TERA cube — a hexadeca-tree in which Meji is recoverable from Odu by high-bit extraction and a Morton key packs both levels — and extend it with a learned sub-voxel residual indexed by a navigable small-world graph (HNSW). The result is a two-tier rolling memory whose effective context grows from a fixed working set to a million-item global store at constant attention cost. Measured on an Apple M1 Max (32 GB) against the platform's own corpora: a 64% relative gain in retrieval recall@10 from composable levers (bge-large embedding, PCA-whitening, Hopfield consolidation), full recall retention at a 285× scan-cost reduction on a 1M-item HNSW store, and an honest negative result on off-the-shelf cross-encoder rerankers. WP-20 extends WP-15 (Vortex-Addressed Semantic Memory) with the residual and graph index it left open.
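
The coarse-to-fine relationship in the abstract can be illustrated with a minimal sketch. It assumes the Odu code is an 8-bit integer whose top four bits are the Meji code, so that each coarse cell has 16 fine children, and it stores only occupied cells in a dictionary. The names `meji_from_odu`, `pack_key`, and `SparseVoxelStore` are hypothetical, and the key simply concatenates the two levels. The paper's actual Morton key layout, the TERA cube geometry, and the HNSW residual index are not reproduced here.

```python
from collections import defaultdict

MEJI_BITS = 4  # coarse level: 16 cells (assumed layout)
ODU_BITS = 8   # fine level: 256 cells, 16 per coarse cell (assumed layout)
CHILD_BITS = ODU_BITS - MEJI_BITS


def meji_from_odu(odu: int) -> int:
    """Recover the coarse 4-bit code as the high bits of the 8-bit code."""
    if not 0 <= odu < (1 << ODU_BITS):
        raise ValueError("odu must fit in 8 bits")
    return odu >> CHILD_BITS


def pack_key(odu: int) -> int:
    """Pack both levels into one 12-bit integer, coarse code in the high bits.

    Illustrative layout only; this is not the paper's Morton key.
    """
    return (meji_from_odu(odu) << ODU_BITS) | odu


class SparseVoxelStore:
    """Keeps only occupied cells, so a lookup never scans empty ones."""

    def __init__(self) -> None:
        self._cells: dict[int, list[str]] = defaultdict(list)

    def insert(self, odu: int, item: str) -> None:
        self._cells[pack_key(odu)].append(item)

    def lookup_fine(self, odu: int) -> list[str]:
        # .get avoids creating an empty cell as a side effect of reading.
        return list(self._cells.get(pack_key(odu), []))

    def lookup_coarse(self, meji: int) -> list[str]:
        """Probe the 16 fine cells under one coarse cell, in code order."""
        found: list[str] = []
        for child in range(1 << CHILD_BITS):
            odu = (meji << CHILD_BITS) | child
            found.extend(self._cells.get(pack_key(odu), []))
        return found

    def occupied(self) -> int:
        return len(self._cells)


store = SparseVoxelStore()
store.insert(0xA3, "a")
store.insert(0xA7, "b")
store.insert(0x10, "c")
print(store.lookup_coarse(0xA))  # ['a', 'b']
print(store.occupied())          # 3
```

A coarse query costs a fixed 16 dictionary probes regardless of how many items the store holds, which is the "look them up without scanning" property in miniature.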
