Large Language Models exhibit impressive linguistic competence, yet this has given rise to a foundational debate as to whether their capacity for generalization stems from human-like systematicity or mere stochastic parroting. Central to this is the principle of compositionality. This survey clarifies the mathematical requirements for compositionality by examining major paradigms, from n-grams to deep neural architectures, through the lens of homomorphisms between syntactic and semantic algebras. We demonstrate that the limitations of these models converge to the curse of dimensionality, rendering the learning of compositionality an inherently ill-posed inverse problem. To address this, we propose formalizing learning as an inverse problem in representation theory, where linguistic symmetries act as essential regularization to constrain the hypothesis space. We further suggest that geometric machine learning, leveraging these symmetries as inductive biases, offers a novel mechanism, potentially utilizing mathematical formulations like Clifford algebra, where the establishment of homomorphisms serves as a critical indicator of compositional generalization. This framework redirects the research focus toward elucidating the algebraic structure of language itself.
Maeda et al. (Wed,) studied this question.