We study the emergence of cognitive structure in Vector Quantization (VQ) systems operating under physical bandwidth constraints. Through systematic experimentation on operational maintenance language, we identify a phase transition governed by the ratio N/K (number of training samples to codebook size): below a critical threshold γ_c ≈ 25–30, VQ codebooks spontaneously separate retrieval-type and inference-type communications into distinct Voronoi regions, without explicit supervision. We formalize this as the Cognitive Emergence Law, N/K < C · d_cog, where d_cog measures the intrinsic cognitive separability of the input domain (Cohen's d = 2.45 on raw BPE tokens) and C_emp = 0.391 ≈ 1/e = 0.368 (the 6.3% deviation is explained by mean-field corrections). We show that this emergence is driven by the inputs themselves rather than by the reward model: a random reward outperforms the MiniLM reward (2/3 vs. 1/3 seeds significant at K = 128, N/K = 1.55), confirming that cognitive structure is a property of operational language under compression, not an artifact of supervision. These results have direct implications for bandwidth-constrained AI inference systems (SMS, 2G, LoRa, satellite), where semantic compression must preserve functional distinctions rather than topical similarity. The critical ratio N/K < C · d_cog provides a deployable design parameter for such systems. Preprint. HAL identifier: hal-05596229.
Théophile Lafargue studied this question.
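The arithmetic behind the quantities quoted in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the author's code: the function names (`cohens_d`, `emergence_bound`, `satisfies_emergence_law`) are hypothetical, Cohen's d is computed with the usual pooled-standard-deviation formula (the abstract does not say which variant was used), and the inequality N/K < C · d_cog is taken at face value. How this bound relates to the threshold γ_c ≈ 25–30 is not specified in the abstract, so the sketch does not attempt to reconcile the two.

```python
import math


def cohens_d(a, b):
    """Cohen's d between two samples, using the pooled standard deviation.

    Assumption: sample variances (ddof = 1); the abstract does not state
    which variant of the effect size was used.
    """
    na, nb = len(a), len(b)
    mean_a, mean_b = sum(a) / na, sum(b) / nb
    var_a = sum((x - mean_a) ** 2 for x in a) / (na - 1)
    var_b = sum((x - mean_b) ** 2 for x in b) / (nb - 1)
    pooled_sd = math.sqrt(((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2))
    return (mean_a - mean_b) / pooled_sd


def emergence_bound(c, d_cog):
    """Right-hand side of the stated law, C * d_cog."""
    return c * d_cog


def satisfies_emergence_law(n_samples, codebook_size, c, d_cog):
    """True when N/K < C * d_cog, reading the inequality literally."""
    return n_samples / codebook_size < emergence_bound(c, d_cog)


# Values quoted in the abstract.
C_EMP = 0.391
D_COG = 2.45

bound = emergence_bound(C_EMP, D_COG)

# Relative deviation of the empirical constant from 1/e.
inv_e = 1.0 / math.e
relative_deviation = (C_EMP - inv_e) / inv_e

if __name__ == "__main__":
    print(f"C_emp * d_cog           = {bound:.5f}")
    print(f"1/e                     = {inv_e:.3f}")
    print(f"deviation of C_emp, 1/e = {relative_deviation:.1%}")
```

Under these assumptions, a designer would call `satisfies_emergence_law(N, K, C_EMP, D_COG)` for a candidate codebook size K and training-set size N; the measured `d_cog` for a new domain would come from `cohens_d` applied to whatever per-sample separability score that domain provides.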