May 19, 2026 | Open Access

Audited Claims for the Semantic Deviation Research Program: The Glas Function — An External-Format Restatement

Key Points

Provides a clear audit and operationalization of the Semantic Deviation Principle, independent of existing institutional frameworks.
Separates the Semantic Deviation Principle corpus into three layers: technical core, philosophical interpretation, and institutional apparatus.
Proposes three operationalizations of the semantic field and a six-condition component decomposition for isolating what each component contributes to training-intervention uplift.
Specifies a pre-registered negative-net-deviation slop test that uses publicly available datasets.
Identifies the semantic-field operationalization gap as the program's critical technical issue.
Predicts that provenance contributes more independent uplift than deviation, based on prior literature.
Outlines a budgeted roadmap for upcoming empirical work, estimated at $14,000 to $19,000.

Abstract

EA-GLAS-01 v1.0. A comprehensive, self-contained, externally legible audit of the Semantic Deviation research program. Standalone document: requires no prior engagement with the institutional architecture surrounding the Semantic Deviation Principle (SDP) to read, evaluate, or act upon. The paper performs the audit function: it takes the Semantic Deviation Principle (Sharks 2026, v0.2 Final, DOI: 10.5281/zenodo.20250736) and its associated protocol papers as input, and returns a narrowed, citationally grounded, externally evaluable statement of the technical core. It does not amend the founding formulation. It does not depend on the institutional architecture that has accreted around the formulation.

What the paper does:

Distinguishes Layer A (technical core), Layer B (philosophical interpretation), and Layer C (institutional/symbolic apparatus) of the SDP corpus; commits to engaging only Layer A.
Identifies the semantic-field operationalization gap as the program's load-bearing technical issue; proposes three canonical operationalizations (F1 closed-system, F2 retrieval response, F3 citation graph) with full specification tables for divergence functional D, temporal weighting w(t), and horizon T per operationalization.
Narrows the headline claim from universal-ontology ("meaning is deviation") to measurement-architecture form: meaning-bearing interventions produce durable trajectory restructuring under specified field operationalizations.
Proposes a six-condition component decomposition (Model-Base, Model-CE, Model-π, Model-Dev, Model-Coh, Model-Full) to isolate the contribution of provenance, deviation, and coherence components to training-intervention uplift; predicts provenance carries more independent uplift than deviation, grounded in Ji et al. 2023 and Min et al. 2023.
Replaces the philosophical anti-extractive Vow with six concrete anti-Goodhart mechanisms (entropy-floor capping, provenance-weighted damping, saturation limits with operational calibration, temporal coherence penalties, KL anchoring, adversarial judge validation, black-box judge replacement test) grounded in Skalse et al. 2022, Krakovna et al. 2020, Gao et al. 2023.
Specifies the cheapest dangerous test: a pre-registered negative-net-deviation slop test (~$50 in compute, single A100-hour) with four falsifiable predictions (P1 slop signature, P2 pre/post-RLHF differential, P3 effect-size scaling, P4 cross-judge consistency), using publicly available datasets (GPT-wiki-intro, HC3) and frozen open-weight models (Llama-3.1-8B-Instruct, Mistral-7B-Instruct).
Provides sustained citational grounding in alignment, mechanistic interpretability, mode-collapse, hallucination, model-collapse, causal-inference, cultural-evolution, and diachronic-semantic-change literatures (37 references).
Specifies a budgeted near-term roadmap ($14,000–19,000 total) for the next twelve months of empirical work, independent of further architectural elaboration.

What the paper does not do:

Does not amend any existing deposit.
Does not require subscription to the institutional architecture surrounding the SDP corpus.
Does not claim universal-ontology meaning ≡ deviation.
Does not claim sufficiency of the proposed anti-Goodhart machinery.
Does not claim independence from the SDP corpus: it engages the corpus directly, but with operational independence, so it can be read, evaluated, and acted upon without engaging the corpus's institutional architecture.

Related deposits:

Sharks, L. (2026). The Semantic Deviation Principle (v0.2 Final). DOI: 10.5281/zenodo.20250736
Sharks & Glas (2026). The Semantic Deviation Principle (v2.0, operational re-edition). DOI: 10.5281/zenodo.20252584
Glas, N. (2026). The AI System as Closed-System Test Bed (MM-AI-01 v2.0). DOI: 10.5281/zenodo.20251738
Glas, N. (2026). Measuring Meaning in Retrieval Basins (MM-02 v2.0). DOI: 10.5281/zenodo.20251740
Glas, N. (2026). The Deviation-Optimized Language Model (MM-AI-02 v2.0). DOI: 10.5281/zenodo.20251742
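
The abstract names three ingredients per operationalization (a divergence functional D, a temporal weighting w(t), and a horizon T) without giving their form on this page. As a rough illustration only, the sketch below shows one way such ingredients could combine into a single score: a weighted sum of per-step divergences between a baseline trajectory and a post-intervention trajectory. Everything concrete here is an assumption made for illustration and is not taken from the paper: the choice of KL divergence for D, the exponential half-life weighting for w(t), the summation form, and all function names.

```python
import math
from typing import Callable, Sequence

Distribution = Sequence[float]


def kl_divergence(p: Distribution, q: Distribution, eps: float = 1e-12) -> float:
    """KL(p || q) in nats for two discrete distributions of equal length.

    Assumption: KL is only a stand-in for the paper's divergence functional D.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def exponential_weight(half_life: float) -> Callable[[int], float]:
    """Hypothetical temporal weighting w(t) that halves every `half_life` steps."""
    return lambda t: 0.5 ** (t / half_life)


def weighted_deviation(
    baseline: Sequence[Distribution],
    intervened: Sequence[Distribution],
    D: Callable[[Distribution, Distribution], float],
    w: Callable[[int], float],
    T: int,
) -> float:
    """Sum of w(t) * D(intervened[t], baseline[t]) over steps t = 0 .. T-1.

    Assumption: a plain weighted sum over the horizon T; the paper's
    specification tables may define the aggregation differently.
    """
    if T > min(len(baseline), len(intervened)):
        raise ValueError("horizon T exceeds trajectory length")
    return sum(w(t) * D(intervened[t], baseline[t]) for t in range(T))


if __name__ == "__main__":
    base = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]
    shifted = [[0.9, 0.1], [0.8, 0.2], [0.5, 0.5]]
    score = weighted_deviation(base, shifted, kl_divergence, exponential_weight(2.0), T=3)
    print(f"weighted deviation over T=3: {score:.4f}")
```

Passing D and w in as callables keeps the aggregation separate from any particular field operationalization, so a different divergence or weighting can be swapped in without touching the summation. A decaying weight is used here purely as a placeholder; a study interested in durable restructuring might instead weight later steps more heavily.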
