What question did this study set out to answer?

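The study aims to prove that, without cooperation from the text generator (e.g., watermarking), detecting AI-generated text becomes information-theoretically impossible as the model's output distribution converges to the human text distribution.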

February 22, 2026 | Open Access

Indistinguishable by Design: On the Impossibility of AI Text Detection Without Generator Cooperation

Key Points

Formulated the detection problem using statistical hypothesis testing.
Developed theoretical proofs connecting the optimal detection advantage to the distance between the human and model text distributions.
Extended findings to conditional distributions and analyzed effects of post-processing.
Introduced a multi-sample detection bound: with n independent texts from the same source, the optimal detection advantage grows at most as √n times the per-sample divergence, so distributional convergence still dominates.
Proved that the maximum detection advantage equals the total variation distance between distributions.
Established that detection advantage vanishes as cross-entropy loss approaches human text entropy.
Demonstrated degradation of detection ability through post-processing according to the data processing inequality.
Outlined a systematic taxonomy mapping each major class of AI text detection to the theoretical result that bounds its performance.
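The findings on total variation distance, Pinsker's inequality, and post-processing can be checked numerically on a toy example. The sketch below is illustrative only and is not taken from the study: the four-symbol "text" distributions `human` and `model`, the likelihood-ratio detector `lr_detector`, and the post-processing map `merge_a_d` are all hypothetical choices made for this example, and logarithms are natural (nats).

```python
import math

def total_variation(p, q):
    """Total variation distance between two distributions on the same finite support."""
    return 0.5 * sum(abs(p[x] - q[x]) for x in p)

def kl_divergence(p, q):
    """KL(p || q) in nats; assumes q[x] > 0 wherever p[x] > 0."""
    return sum(p[x] * math.log(p[x] / q[x]) for x in p if p[x] > 0)

def advantage(detector, p, q):
    """|Pr_{x~q}[detector flags x] - Pr_{x~p}[detector flags x]|."""
    flag_q = sum(q[x] for x in q if detector(x))
    flag_p = sum(p[x] for x in p if detector(x))
    return abs(flag_q - flag_p)

def push_forward(dist, f):
    """Distribution of f(x) when x is drawn from dist (deterministic post-processing)."""
    out = {}
    for x, pr in dist.items():
        out[f(x)] = out.get(f(x), 0.0) + pr
    return out

# Hypothetical toy distributions over four "texts" (illustrative numbers only).
human = {"a": 0.40, "b": 0.30, "c": 0.20, "d": 0.10}
model = {"a": 0.35, "b": 0.30, "c": 0.20, "d": 0.15}

def lr_detector(x):
    # Flag a text as model-written when the model assigns it more
    # probability than the human distribution does.
    return model[x] > human[x]

def merge_a_d(x):
    # Hypothetical post-processing that makes "a" and "d" indistinguishable.
    return "ad" if x in ("a", "d") else x

tv = total_variation(human, model)
adv = advantage(lr_detector, human, model)
pinsker = math.sqrt(kl_divergence(human, model) / 2)
tv_post = total_variation(push_forward(human, merge_a_d),
                          push_forward(model, merge_a_d))

print(f"TV distance:                 {tv:.4f}")
print(f"Likelihood-ratio advantage:  {adv:.4f}")
print(f"Pinsker upper bound:         {pinsker:.4f}")
print(f"TV after post-processing:    {tv_post:.4f}")
```

On this example the likelihood-ratio detector attains the total variation distance, the Pinsker bound sits above it, and merging symbols cannot increase the distance, which matches the qualitative claims listed above.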

Abstract

We present a unified formal framework proving that AI-generated text detection is information-theoretically impossible when the model's output distribution converges to the human text distribution, absent generator cooperation (e.g., watermarking). We formalize the detection problem as a statistical hypothesis test between two distributions over text strings and prove that the optimal distinguishing advantage of any detector is exactly equal to the total variation distance between the human and model distributions (Theorem 4.1). We connect this result to the language model training objective, showing via Pinsker's inequality that as cross-entropy loss approaches the entropy of human text, the maximum achievable detection advantage provably vanishes (Theorem 4.4). We extend the impossibility to conditional (per-context) distributions (Theorem 4.8), prove that post-processing can only degrade detection via the data processing inequality (Theorem 4.5), and establish a multi-sample detection bound showing that even with n independent texts from the same source, the optimal detection advantage grows at most as √n times the per-sample divergence, so that distributional convergence still dominates (Theorem 4.10). We connect distributional convergence to empirical scaling laws and provide a systematic taxonomy mapping every major class of AI text detection to the specific theoretical result that bounds its performance. Our results establish that, under the empirically supported assumption of distributional convergence, the failure of AI text detectors is not an engineering shortcoming but a mathematical consequence of the training objective, absent generator cooperation.
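The chain of results described in the abstract can be sketched with standard identities. The notation here is an assumption made for illustration and may not match the paper: $P$ is the human text distribution, $Q_\theta$ the model distribution, $\mathrm{Adv}^{*}$ the optimal detection advantage, $\mathcal{L}(\theta)$ the cross-entropy loss in nats, $H(P)$ the entropy of human text, and $f$ any post-processing map. The exact statements and constants in the cited theorems may differ.

```latex
\begin{align*}
\mathrm{Adv}^{*}(P, Q_\theta) &= \mathrm{TV}(P, Q_\theta)
  = \tfrac{1}{2}\sum_{x} \lvert P(x) - Q_\theta(x) \rvert
  && \text{(Theorem 4.1)} \\
\mathcal{L}(\theta) &= \mathbb{E}_{x \sim P}\left[-\log Q_\theta(x)\right]
  = H(P) + D_{\mathrm{KL}}(P \,\|\, Q_\theta) \\
\mathrm{TV}(P, Q_\theta) &\le \sqrt{\tfrac{1}{2} D_{\mathrm{KL}}(P \,\|\, Q_\theta)}
  = \sqrt{\tfrac{1}{2}\left(\mathcal{L}(\theta) - H(P)\right)}
  && \text{(Pinsker; cf. Theorem 4.4)} \\
\mathrm{TV}\bigl(f(P), f(Q_\theta)\bigr) &\le \mathrm{TV}(P, Q_\theta)
  && \text{(data processing; cf. Theorem 4.5)} \\
\mathrm{TV}\bigl(P^{\otimes n}, Q_\theta^{\otimes n}\bigr)
  &\le \sqrt{\tfrac{n}{2} D_{\mathrm{KL}}(P \,\|\, Q_\theta)}
  && \text{(cf. Theorem 4.10)}
\end{align*}
```

The last line follows from Pinsker's inequality and the additivity of KL divergence over independent samples. Read together, the lines show why the bound tightens as training proceeds: as $\mathcal{L}(\theta) \to H(P)$, the KL term goes to zero, and with it every upper bound on the detector's advantage for any fixed $n$.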
