What question did this study set out to answer?

The study asks whether a phonosemantic framework derived from Sanskrit phonology can make the semantic representations used by AI models more interpretable.

April 16, 2026 · Open Access

Phonosemantic Grounding: Sanskrit as a Formalized Case of Motivated Sign Structure for Interpretable AI

Key Points

Formalized a four-dimensional phonosemantic coordinate system from Sanskrit phonology.
Conducted a proof-of-concept experiment on 150 Sanskrit verbal roots to test whether articulatory locus groupings predict semantic clustering against Monier-Williams dictionary definitions.
Reported three methods: hypothesis-driven axis scoring, a linear probe, and a blind TF-IDF clustering experiment.
A complexity analysis shows the phonosemantic context model achieves O(1) memory and O(L) time, with structural convergence to state-space models such as Mamba.
Hypothesis-driven axis scoring yielded p ≈ 10⁻¹⁴.
A linear probe on articulatory geometry reached a 63.3% group classification rate, compared with 49.3% for phoneme identity alone (+14 pp, p < 0.001).
The blind TF-IDF clustering experiment was not significant at the tested scale.
Formalized the continuous-time ODE underlying the resonance state model.

Abstract

Modern language models represent meaning as statistical proximity in high-dimensional embedding spaces whose geometry is difficult to interpret. This paper proposes an alternative representation framework grounded in the physiology of speech production. We formalize a four-dimensional phonosemantic coordinate system (articulation locus, articulation manner, phonation type, somatic resonance locus) derived from the articulatory anatomy of Sanskrit phonology, define the phonosemantic manifold as a structured geometric substrate for AI embeddings, and propose the harmonic coherence metric as a physically interpretable replacement for cosine similarity. A proof-of-concept experiment on 150 Sanskrit verbal roots tests whether articulatory locus groupings predict semantic clustering against Monier-Williams dictionary definitions. Three complementary methods are reported: hypothesis-driven axis scoring (p ≈ 10⁻¹⁴), a linear probe showing articulatory geometry achieves 63.3% group classification vs. 49.3% for phoneme identity alone (+14 pp, p < 0.001), and a blind TF-IDF clustering experiment (not significant at this scale, reported in full). A complexity analysis shows the phonosemantic context model achieves O(1) memory and O(L) time, with structural convergence to state-space models such as Mamba. The paper also formalizes the continuous-time ODE underlying the resonance state model, proposes a phonosemantic decoding objective that reduces output vocabulary from 50,000 tokens to 50 phonemes, and connects the framework to the Information Bottleneck principle in representation learning.
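
The O(1) memory and O(L) time claim can be illustrated with a minimal sketch. It is not the paper's resonance state model: the update rule (a plain exponential moving average), the `decay` parameter, the function name `run_context`, and the numeric coordinate values are all assumptions made for illustration. Only the four axis names come from the abstract.

```python
from typing import Iterable, Tuple

# One phoneme as a point on four axes named in the abstract:
# (articulation locus, articulation manner, phonation type, somatic resonance locus).
# The numeric encoding is a placeholder, not the paper's.
Coord = Tuple[float, float, float, float]


def run_context(coords: Iterable[Coord], decay: float = 0.9) -> Tuple[Coord, int]:
    """Fold a phoneme sequence into one fixed-size state.

    Assumed update rule (hypothetical): state = decay * state + (1 - decay) * coord.
    The state is always four floats, so memory does not grow with sequence
    length, and each phoneme is visited exactly once, so time is linear in L.
    """
    state: Coord = (0.0, 0.0, 0.0, 0.0)
    steps = 0
    for coord in coords:
        state = tuple(decay * s + (1.0 - decay) * x for s, x in zip(state, coord))
        steps += 1
    return state, steps


if __name__ == "__main__":
    # Three identical placeholder phonemes, streamed from a generator so the
    # full sequence is never held in memory.
    sequence = ((1.0, 0.0, 1.0, 0.0) for _ in range(3))
    final_state, n = run_context(sequence, decay=0.5)
    print(final_state, n)
```

A fixed-size state updated once per input element is also the basic shape of a state-space model's recurrence, which is the sense in which the abstract describes structural convergence with models such as Mamba.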
