This paper studies social choice under frame uncertainty when a strategic actor can shape, filter, or distort public evidence. Institutional admissibility is encoded as pairwise σ-fields G_xy ⊆ A restricting the information usable to decide a pair (x, y). Under spurious unanimity (Mongin, 1997), expressed unanimity can coexist with latent frame disagreement, forcing any expressed-Pareto decision rule to misrank with probability at least the posterior mass of disagreeing frames. A natural repair is witnessing: if the electorate observes an i.i.d. witness stream E^(n) whose frame-conditioned laws {p_f}_{f ∈ F} are separated, MAP triangulation achieves exponentially decaying error, with exponent governed by the minimal Chernoff information. The central result reverses this conclusion in transparent regimes. When the strategic actor knows the witness metric in advance and can tailor evidence generation or reporting accordingly, witnessing becomes a control target: the actor can solve a drift-cancellation (stabilization) problem and erase the information content of the witness.

We quantify this via a capacity inequality. Define Witness Capacity C_W as the minimal per-sample Chernoff information separating frames. Define Obfuscation Capacity C_S as a verifiable KL-budget bounding how far the actor can shift frame-conditioned witness laws. We prove a sharp aliasing boundary (a “Nyquist limit”): if C_S ≥ C_W, the robust Chernoff information can be driven to zero (aliased), so no amount of data restores inference. If C_S < C_W [...] C_S. Mathematically, the robust-testing structure is classical (Huber, 1964; Huber–Strassen, 1973); the novelty is the social-choice/RLHF isomorphism and the capacity interpretation of Goodhart-type failures.

Dynamic control connection. In many transparent regimes (including RLHF), obfuscation is implemented not as an arbitrary one-shot distribution shift but via closed-loop, bang-bang evidence control that stabilizes an admissible drift statistic at an indifference set-point. The DSCT analysis in Fathi 3 proves (i) an implementability condition in convex-hull form and (ii) that switching-cost frictions induce a hysteresis band without, by themselves, restoring learnability in per-sample units. This motivates treating C_S as an institutional/actor capacity that can remain large even under natural frictions; robustness requires commitment boundaries or hard resource constraints that lower the effective C_S below C_W.

We then describe two escape routes: (i) cryptographic commitments that keep randomized audit metrics secret until after commitment, restoring private-witness timing against computationally bounded actors; and (ii) computational hardness arising from discrete, template-restricted institutions. We close by mapping the framework to AI alignment via RLHF: transparent rubrics induce strategic classification behavior (Hardt et al., 2016) and enable reward hacking; the Nyquist inequality predicts when interpretability-style metrics are structurally vulnerable.
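
The capacity comparison can be illustrated numerically. The following is a minimal sketch, not the paper's construction: it assumes two frames with witness laws `p` and `q` on a three-letter alphabet (made-up numbers), and it assumes the actor's budget is measured per frame as D(shifted ‖ original). All function names are hypothetical. It relies on the classical identity that the Chernoff information equals the KL divergence from the optimal geometric mixture ("midpoint") to either law, so a per-frame budget at least that large lets a toy actor move both laws onto the same midpoint.

```python
import math

def tilt(p, q, lam):
    """Return (normalized p^lam * q^(1-lam), normalizer Z)."""
    w = [pi ** lam * qi ** (1.0 - lam) for pi, qi in zip(p, q)]
    z = sum(w)
    return [wi / z for wi in w], z

def kl(a, b):
    """KL divergence D(a || b) in nats, for strictly positive b."""
    return sum(ai * math.log(ai / bi) for ai, bi in zip(a, b) if ai > 0.0)

def chernoff(p, q, iters=200):
    """Chernoff information -min_lam log Z(lam) and the minimizing lam."""
    log_z = lambda lam: math.log(tilt(p, q, lam)[1])
    lo, hi = 0.0, 1.0
    for _ in range(iters):  # ternary search; log Z is convex in lam
        m1, m2 = lo + (hi - lo) / 3.0, hi - (hi - lo) / 3.0
        if log_z(m1) < log_z(m2):
            hi = m2
        else:
            lo = m1
    lam = 0.5 * (lo + hi)
    return max(0.0, -log_z(lam)), lam

def shift_toward_midpoint(p, q, lam_star, budget, own_lam):
    """Toy actor move (an assumption): slide one law along the tilt path
    toward the midpoint until D(shifted || original) hits the budget.
    own_lam = 1.0 selects p as the original law, own_lam = 0.0 selects q."""
    original = tilt(p, q, own_lam)[0]
    midpoint = tilt(p, q, lam_star)[0]
    if kl(midpoint, original) <= budget:
        return midpoint  # budget suffices to reach the midpoint
    lo, hi = 0.0, 1.0  # fraction of the way from the original to the midpoint
    for _ in range(100):  # bisection; the KL cost grows along the path
        t = 0.5 * (lo + hi)
        cand = tilt(p, q, own_lam + t * (lam_star - own_lam))[0]
        if kl(cand, original) <= budget:
            lo = t
        else:
            hi = t
    return tilt(p, q, own_lam + lo * (lam_star - own_lam))[0]

def residual_chernoff(p, q, budget):
    """Chernoff information left after the toy actor shifts both laws."""
    _, lam_star = chernoff(p, q)
    p_s = shift_toward_midpoint(p, q, lam_star, budget, 1.0)
    q_s = shift_toward_midpoint(p, q, lam_star, budget, 0.0)
    return chernoff(p_s, q_s)[0]

# Illustrative frame-conditioned witness laws (made-up numbers).
p = [0.7, 0.2, 0.1]
q = [0.1, 0.2, 0.7]
c, lam_star = chernoff(p, q)
mid = tilt(p, q, lam_star)[0]
print("Chernoff information:", c)
print("KL(mid || p), KL(mid || q):", kl(mid, p), kl(mid, q))
print("residual, budget = 1.01 * C:", residual_chernoff(p, q, 1.01 * c))
print("residual, budget = 0.50 * C:", residual_chernoff(p, q, 0.5 * c))
```

In this toy, a budget slightly above the Chernoff information leaves essentially no separation between the shifted laws, while half that budget leaves a strictly positive residual. Because the midpoint path is only one feasible strategy, the residual for a sub-threshold budget should be read as an upper bound on the robust value under the assumed budget convention, not as the paper's exact robust exponent.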
Kevin Fathi
Kevin Fathi (Tue,) studied this question.
www.synapsesocial.com/papers/698ebf3485a1ff6a93016765 — DOI: https://doi.org/10.5281/zenodo.18609933