NOTE: ILLUSTRATIVE NUMBERS - WIP

Large language models hallucinate because their training data carries no epistemic metadata: facts, hypotheses, value judgments, and acknowledged unknowns occupy the same embedding space with identical weight. We propose VKB-Training (Verified Knowledge Base Training), a data-centric approach that assigns each training sample a five-category epistemic tag (Fact, Model, Value, Hypothesis, BlindSpot), a calibrated confidence score, and a provenance chain.

We introduce a four-stage hybrid annotation pipeline:

(1) AI triangulation: multiple LLMs classify independently; inter-model disagreement signals normative content (the "Caesar/God boundary").
(2) Human sampling with axiom extraction: domain annotators resolve high-disagreement cases; recurrent decision principles are extracted as reusable rules.
(3) Expert calibration with reputation weighting: formalized Galton's ox-weighing insight (per S.V.E. XI, DOI: 10.5281/zenodo.18109198).
(4) Logical consistency filters: contradiction detection and symmetry verification via the CGS Method (DOI: 10.5281/zenodo.18776172).

Three training mechanisms are proposed: confidence-weighted loss, provenance-aware attention, and a BlindSpot training objective that maximizes output entropy at known knowledge gaps.

VKB-Training was first described as part of the CogOS framework (DOI: 10.5281/zenodo.18109244). This paper extracts and formalizes the VKB component as a standalone, empirically testable proposal with a falsifiable experimental protocol and pre-specified success thresholds. Section 7 (Ethical Data Sourcing: Author Revenue Sharing, 10-50%) is included in the preprint but will be omitted from the workshop submission. Prepared for submission to NeurIPS 2026 Workshop.
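The abstract gives no formulas, so the following is only a minimal sketch of one plausible reading of the tagged sample, the triangulation disagreement signal, the confidence-weighted loss, and the BlindSpot entropy objective. Every name here (`Sample`, `disagreement`, `sample_loss`, `blindspot_weight`) is hypothetical and not taken from the paper; the choice of normalized Shannon entropy as the disagreement measure and of scaling cross-entropy by the confidence score are assumptions. Provenance-aware attention is not sketched because the abstract does not describe it in enough detail.

```python
import math
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class EpistemicTag(Enum):
    # The five categories named in the abstract.
    FACT = "Fact"
    MODEL = "Model"
    VALUE = "Value"
    HYPOTHESIS = "Hypothesis"
    BLINDSPOT = "BlindSpot"


@dataclass
class Sample:
    # Hypothetical record layout: tag, confidence in [0, 1], provenance chain.
    text: str
    tag: EpistemicTag
    confidence: float
    provenance: tuple = ()


def disagreement(votes):
    """Normalized Shannon entropy of independent classifier votes (assumed measure).

    0 means all classifiers agree; larger values mean more spread.
    """
    n = len(votes)
    counts = Counter(votes)
    h = -sum((c / n) * math.log(c / n) for c in counts.values())
    return h / math.log(len(EpistemicTag))


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def sample_loss(sample, logits, target_index, blindspot_weight=1.0):
    """Per-sample loss under the assumptions stated above.

    BlindSpot samples: negative output entropy, so minimizing the loss
    maximizes entropy. All other samples: cross-entropy scaled by confidence.
    """
    probs = softmax(logits)
    if sample.tag is EpistemicTag.BLINDSPOT:
        return -blindspot_weight * entropy(probs)
    return sample.confidence * -math.log(probs[target_index])
```

Under this reading, a sample with confidence 0.5 contributes half the gradient signal of an otherwise identical sample with confidence 1.0, and a BlindSpot sample reaches its minimum loss when the output distribution is uniform.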
Artiom Kovnatsky
Laboratoire Spécification et Vérification
Artiom Kovnatsky (Sun,) studied this question.
www.synapsesocial.com/papers/69d49fc5b33cc4c35a228434 — DOI: https://doi.org/10.5281/zenodo.19430119