What question did this study set out to answer?

This research aims to develop a neural model for semantic representation based on speech production acoustics.

April 18, 2026 · Open Access

Sequential Phonosemantic Encoding in ODE Reservoirs: Breaking Static Baselines and Measuring Capacity Ceilings

Key Points

Develops a neural model for semantic representation grounded in speech production acoustics rather than statistical co-occurrence.
Converts the formant trajectories (F1, F2) of verbal roots into continuous-time ordinary differential equation (ODE) dynamics.
Uses no global error signal, backpropagation, or word-level semantic labels at any stage.
Proves that the resonance state update is structurally identical to Mamba's selective state-space recurrence under zero-order-hold discretization.
Characterizes the representational capacity of single-layer architectures.
Achieves an Adjusted Rand Index (ARI) of 0.069 on the 150-root Paninian benchmark, a 61% improvement over static embeddings.
Identifies a stable capacity ceiling near 0.06 ARI, marking a limit on semantic abstraction in phonological reservoirs.
Argues that deep categorical meaning requires a secondary hierarchical processing layer.
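
For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how an Adjusted Rand Index can be computed from two labelings of the same items. It uses the standard pair-counting definition; the function name `adjusted_rand_index` and the toy labels are illustrative assumptions and are not taken from the paper or its benchmark.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Pair-counting ARI between two labelings of the same items.

    1.0 means identical partitions (up to label renaming); values near
    0.0 mean agreement no better than chance; negative values are possible.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must have the same length")
    n = len(labels_true)
    if n < 2:
        return 1.0  # no pairs to disagree on

    # Contingency counts: how many items fall in (true class, predicted cluster).
    cells = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)

    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())

    expected = sum_rows * sum_cols / comb(n, 2)  # chance-level pair agreement
    max_index = 0.5 * (sum_rows + sum_cols)
    if max_index == expected:
        return 1.0  # degenerate case: both partitions are trivial
    return (sum_cells - expected) / (max_index - expected)


# Toy labels only (hypothetical, not the Paninian benchmark).
axis_labels = [0, 0, 1, 1, 2, 2]
clusters = [1, 1, 0, 0, 2, 2]  # same partition, different label names
print(adjusted_rand_index(axis_labels, clusters))
```

Because ARI is corrected for chance, a score such as 0.069 indicates weak but above-chance agreement with the reference labels, which is why the reported ceiling near 0.06 is informative.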

Abstract

We present a neural architecture for semantic representation grounded in the physical acoustics of speech production rather than statistical co-occurrence. The Receiver Model processes verbal roots by converting their sequential acoustic formant trajectories (F1, F2 frequencies) into heterogeneous continuous-time Ordinary Differential Equation (ODE) dynamics. No global error signal, backpropagation, or word-level semantic labels are used at any stage. We prove formally that (i) the resonance state update of the network is structurally identical to Mamba's selective state-space recurrence via zero-order-hold discretization; (ii) under local unconstrained dynamics, the weight matrix maximizes an explicit structural objective; and (iii) the harmonic coherence metric H over the phonosemantic manifold M is a proper pseudometric satisfying the triangle inequality. Empirically, we report that continuous-time sequential encoding achieves ARI = 0.069 on the 150-root Paninian benchmark against independently-derived phenomenological axis labels, providing a +61% performance leap over static embedding geometries. We meticulously characterize the representational capacity ceiling of single-layer architectures, demonstrating a stable 0.06 ARI barrier across exhaustive structural and Group Relative Policy Optimization (GRPO) sweeps. This defines the first empirical boundary for cross-locus semantic abstraction in zero-weight phonological reservoirs, providing formal justification that deep categorical meaning requires secondary hierarchical layer processing. The architecture achieves O(1) context memory and O(L) time, with physical, interpretable state coordinates.
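
To make the zero-order-hold claim concrete, here is a minimal sketch of a small bank of linear resonators, each obeying dh/dt = a·h + b·u(t), discretized with a zero-order hold so that each step is h ← ā·h + b̄·u with ā = exp(a·Δ) and b̄ = (ā − 1)/a · b. This is the same form of discretization that Mamba applies to its diagonal state matrix. Everything specific here is an assumption for illustration: the function names, the complex poles, the scalar input, and the fixed step size are not taken from the paper, and Mamba additionally makes the step size and projections input-dependent, which this sketch omits.

```python
import cmath


def zoh_discretize(a, b, dt):
    """Zero-order-hold discretization of dh/dt = a*h + b*u (requires a != 0).

    Returns (a_bar, b_bar) such that h[k+1] = a_bar*h[k] + b_bar*u[k]
    is exact when u is held constant over each interval of length dt.
    """
    a_bar = cmath.exp(a * dt)
    b_bar = (a_bar - 1.0) / a * b
    return a_bar, b_bar


def run_reservoir(inputs, poles, b=1.0, dt=0.01):
    """Drive a bank of independent resonators with a scalar input sequence.

    Only the current state is stored, so memory does not grow with the
    sequence length L, and the loop performs one update per input sample.
    """
    params = [zoh_discretize(a, b, dt) for a in poles]
    state = [0j] * len(poles)
    for u in inputs:
        state = [a_bar * h + b_bar * u for (a_bar, b_bar), h in zip(params, state)]
    return state


# Hypothetical poles: damped oscillators with different decay rates and frequencies.
poles = [complex(-2.0, 3.0), complex(-5.0, 12.0), complex(-0.5, 40.0)]
# Hypothetical input: a normalized trajectory sampled at 100 points.
trajectory = [0.3 + 0.005 * k for k in range(100)]
final_state = run_reservoir(trajectory, poles)
print([abs(h) for h in final_state])
```

The heterogeneity in this sketch comes only from giving each unit a different pole, so the final state is a fixed-size summary of the whole sequence whose coordinates can be read as resonator amplitudes and phases.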
