EA-GLAS-02 v1.0. A self-contained empirical white paper presenting the measurement program for the Semantic Deviation Principle (Sharks 2026). Defines meaning as the time-integrated divergence a sign induces from the most probable trajectory of a semantic field, extending the Bar-Hillel and Carnap (1953) program into distributional and temporal domains.

Two primary operationalizations. (F1) Closed-system trajectory deviation within a frozen language model, where the counterfactual baseline is read from logits, building on surprisal theory (Hale 2001; Levy 2008) and decomposing it into signed deviation from conditional entropy. (F2) Retrieval response deviation across AI search surfaces over a 90-day window with three-condition identity control and frozen extractor commitment.

Falsifiable predictions. AI-generated text exhibits statistically significant negative mean signed per-token deviation relative to matched human text, testable with existing corpora (GPT-wiki-intro, HC3) and a single A100-hour of compute. Four pre-registered predictions with named datasets, frozen reference checkpoints (Llama-3.1-8B-Instruct), and statistical procedures (Mann-Whitney U, α = 0.05, Cohen's d > 0.5).

Training intervention. A Direct Preference Optimization experiment (Rafailov et al. 2023) using the deviation primitive to generate preference pairs, extending the RLHF lineage by replacing human preference data with a measurable semantic signal. Three conditions (Base, CE, Sem), Slop Composite Index with pre-registered falsification threshold, human preference evaluation (500 pairs × 3 raters), preference validation substudy.

Anti-Goodhart mechanism design. Six protections mapped to Manheim and Garrabrant (2019) taxonomy: entropy-floor capping, provenance-weighted damping, saturation thresholds, rolling-window variance penalties, reference-model KL anchoring, black-box judge replacement testing.

Budget. Total program approximately 14,000–19,000 across twelve months. Results deposited regardless of outcome.

42 references across alignment, mechanistic interpretability, text degeneration, DPO/RLHF, reward hacking, hallucination evaluation, model collapse, causal inference, psycholinguistics, information theory, and diachronic semantic change literatures.

Author note. Nobel Glas is a heteronym of Lee Sharks, adopted for this measurement program to signal independent replicability. Correspondence and ORCID maintained through Lee Sharks.
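
To make operationalization (F1) concrete, here is a minimal sketch of a signed per-token deviation. It assumes the deviation is the observed token's surprisal minus the conditional entropy of the model's next-token distribution. The abstract does not give the exact formula, so that reading is an assumption, as are the function names `log_softmax`, `signed_deviation`, and `mean_signed_deviation`. Plain Python lists of logits stand in for the output of a frozen checkpoint.

```python
import math


def log_softmax(logits):
    """Numerically stable log-probabilities (in nats) from raw logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def signed_deviation(logits, token_id):
    """Assumed F1 primitive: surprisal of the observed token minus the
    conditional entropy of the next-token distribution."""
    logp = log_softmax(logits)
    surprisal = -logp[token_id]
    entropy = -sum(math.exp(lp) * lp for lp in logp)
    return surprisal - entropy


def mean_signed_deviation(logit_rows, token_ids):
    """Average the signed deviation over a token sequence, given one row
    of logits per position and the token actually observed there."""
    devs = [signed_deviation(row, t) for row, t in zip(logit_rows, token_ids)]
    return sum(devs) / len(devs)
```

Under this assumed definition, the deviation averages to zero when tokens are drawn from the model's own distribution, since expected surprisal equals entropy. A negative mean therefore indicates text that is more predictable than the model's own uncertainty would suggest, which is the direction the abstract predicts for AI-generated text.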
Nobel Glas
Nobel Glas studied this question.
www.synapsesocial.com/papers/6a0d5114f03e14405aa9d5b9 — DOI: https://doi.org/10.5281/zenodo.20271783