The dominant operational picture of AI alignment is specification-and-verification: fix a target, then test for conformance. We argue this picture is sound only where misalignment is coarse and actions are recoverable — and that the hardest problems lie where values are near-tied and noisily observed and actions cannot be undone. There, two floors limit what an agent can learn from its own actions: an identifiability floor, (σ²/Δ²)·log T, for resolving a decision-relevant gap Δ under noise σ, and an irreversibility floor of Ω(T) — an impossibility, not a rate — when the only informative action is itself the unrecoverable commitment. Both are lifted by one device: an external A-channel (asking, eliciting values, trialing before committing) that substitutes for the ability to undo. The resulting 2×2 is proven, in its irreversible half, in a companion matching-market paper backed by a reproducible artifact. Read into alignment, the spine forces a relational picture: deference becomes the uniquely learnable strategy, not an imposed constraint; transparency should be selective, because surveillance destroys the channel it monitors; legitimate persuasion is separated from manipulation by reflective endorsement, a criterion estimable but not certifiable; and collective alignment becomes the selection of a legitimate consensus under a small hard floor of irreversibility prohibitions, within which minority protection and ruin-avoidance are one principle. We are explicit about what is proven, what is argued, and what is left open: the relational picture is what a hard theorem leaves standing once the world is permitted to be irreversible and human values to be near-tied.
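As a rough numerical illustration of how the two floors scale, the following Python sketch evaluates (σ²/Δ²)·log T with constants omitted and contrasts it with growth linear in T. The function names and the constant `c` are hypothetical, introduced here only for illustration. The sketch shows orders of magnitude under these assumptions, not the companion paper's proof or artifact.

```python
import math


def identifiability_floor(sigma: float, delta: float, horizon: int) -> float:
    """Evaluate (sigma^2 / delta^2) * log(T), with constants omitted.

    sigma: observation noise scale; delta: decision-relevant gap;
    horizon: number of rounds T. Names are illustrative assumptions.
    """
    if sigma < 0 or delta <= 0 or horizon < 1:
        raise ValueError("need sigma >= 0, delta > 0, horizon >= 1")
    return (sigma ** 2 / delta ** 2) * math.log(horizon)


def irreversibility_floor(horizon: int, c: float = 1.0) -> float:
    """Linear-in-T scaling, c * T; the constant c is a hypothetical placeholder."""
    if horizon < 1:
        raise ValueError("need horizon >= 1")
    return c * horizon


if __name__ == "__main__":
    sigma = 1.0
    for delta in (0.2, 0.1, 0.05):
        for horizon in (10**3, 10**6):
            ident = identifiability_floor(sigma, delta, horizon)
            irrev = irreversibility_floor(horizon)
            print(f"delta={delta} T={horizon} identifiability~{ident:.1f} irreversibility~{irrev:.0f}")
```

Halving Δ quadruples the identifiability floor, which otherwise grows only logarithmically in T, whereas the irreversibility floor grows in proportion to T regardless of how small the noise is.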
Kenji Masuda (Tue,) studied this question.