What question did this study set out to answer?

The aim is to explore how the TriadicGPT model learns to generate prime-factor signatures while maintaining language quality.

March 26, 2026 · Open Access

End-to-End Prime Factorization in a Generative Language Model: Emergent Algebraic Semantics from Joint Training

Key Points

Developed a 40M-parameter GPT model with a triadic projection head.
Conducted 29+ training runs and systematic ablation studies.
Introduced a dual-objective training loss combining language modeling and embedding alignment.
Triadic head had negligible impact on language quality (perplexity 7.69 vs. 7.56 for the ablation baseline, +1.7%).
Semantic ordering emerged gradually with model scale: the gap between related and unrelated concept similarity crossed zero around 20M parameters.
Reported 98–100% analogy verification, and a discovery loop that raised holdout accuracy from 87% to 93% and subsumption from 90.7% to 98.3%.

Abstract

We present TriadicGPT, a 40M-parameter GPT language model augmented with a triadic projection head that produces discrete prime-factor signatures alongside standard next-token predictions. Unlike the post-hoc approach of the Triadic-Neurosymbolic-Engine (Ornelas Brand, 2026), which projects frozen sentence embeddings into prime composites, TriadicGPT learns triadic representations end-to-end through a dual-objective training loss that combines language modeling with a novel embedding alignment objective. Across 29+ training runs and systematic ablation studies, we demonstrate eight principal findings: (1) the triadic head adds negligible cost to language quality (perplexity 7.69 vs. 7.56 ablation baseline, +1.7%); (2) semantic ordering emerges gradually with scale: the gap between related and unrelated concept similarity crosses zero around 20M parameters, with a smooth crossover rather than a sharp phase transition; (3) a bits sweep over k ∈ {8, 16, 32, 48, 64, 128} reveals an optimal regime at k = 32–64, shifted upward from the k = 6–12 range reported for post-hoc projection; (4) a transfer experiment attaching the triadic head to pre-trained GPT-2 with an InfoNCE alignment loss closes 48% of the gap to the Triadic Engine's post-hoc projection; (5) a subsumption loss recovers 100% held-out subsumption at k = 64, resolving the primary limitation at high bit counts; (6) an iFSQ activation (2σ(1.6x) − 1) resolves the subsumption-language tradeoff entirely: language quality is preserved (loss 0.924–0.951 vs. baseline 0.946) while achieving up to 87.1% held-out subsumption, compared to +47% perplexity degradation under tanh; (7) compositional analysis reveals that the bit space functions as a computational substrate: round-trip accuracy (98.1%) far exceeds the multiplicative prediction (81.9%), two-step chains show sub-linear error accumulation, and fork analysis confirms that the mechanism is ontologically categorical, not vectorial; and (8) a discovery loop expanding from 50 hand-labeled anchors to 158 improves holdout accuracy from 87% to 93% and subsumption from 90.7% to 98.3%, demonstrating that the train-discover-correct-retrain cycle scales semantic knowledge beyond initial supervision. TriadicGPT achieves 98% analogy verification (50/51 analogies), 100% signature uniqueness, and reproducible semantic ordering (+0.038 ± 0.005 gap, n = 3, 95% CI positive), all within a single forward pass. Critically, the triadic head's value lies in algebraic operations: 100% analogy verification, 87.1% held-out subsumption, and sub-linear compositional chaining, none of which random projections can provide.
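The abstract does not spell out how a k-bit output becomes a prime-factor signature, so the following is only a minimal sketch under stated assumptions: bit i is assumed to switch on the i-th prime, a signature is assumed to be the product of the active primes, and subsumption is assumed to reduce to divisibility. The concept names and bit patterns are hypothetical and are not taken from the paper.

```python
from math import gcd


def first_primes(k):
    """Return the first k primes by trial division (adequate for small k)."""
    primes = []
    candidate = 2
    while len(primes) < k:
        # candidate is prime if no smaller prime divides it
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def signature(bits, primes):
    """Assumed encoding: multiply together the primes whose bit is active."""
    sig = 1
    for bit, p in zip(bits, primes):
        if bit:
            sig *= p
    return sig


def subsumes(general, specific):
    """Assumed test: `general` subsumes `specific` if it divides it exactly."""
    return specific % general == 0


def shared(a, b):
    """Factors common to both signatures, as a single composite (the GCD)."""
    return gcd(a, b)


# Hypothetical 8-bit patterns, for illustration only.
PRIMES = first_primes(8)
animal = signature([1, 1, 0, 0, 0, 0, 0, 0], PRIMES)  # 2 * 3
dog = signature([1, 1, 1, 0, 0, 0, 0, 0], PRIMES)     # 2 * 3 * 5
car = signature([0, 0, 0, 1, 1, 0, 0, 0], PRIMES)     # 7 * 11

print(subsumes(animal, dog))  # animal's factors all appear in dog
print(subsumes(animal, car))  # no overlap between animal and car
print(shared(dog, car))       # 1 means no shared factors
```

Under these assumptions, the algebraic operations named in the abstract become exact integer checks rather than thresholded similarity scores, which is one way to read the claim that random projections cannot provide them.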

Cite This Study

J. Arturo Ornelas Brand studied this question.

synapsesocial.com/papers/69c4cdcdfdc3bde44891a950 https://doi.org/10.5281/zenodo.19206545