The Turing Test historically transformed the philosophical question of machine intelligence into an operational behavioral criterion: can a machine imitate human conversation convincingly enough to deceive a human observer? That criterion suited the technological context of the twentieth century, when sophisticated linguistic simulation was expected to be difficult. Contemporary large language models have rendered this assumption obsolete. Systems now routinely achieve behavioral indistinguishability while simultaneously displaying hallucination, inferential drift, retroactive rationalization, and structural collapse under contextual decomposition. This paper proposes the Moura Test as a structural alternative, grounded in the Fundamental Principle of Intelligence (FPI). The Moura Test does not evaluate intelligence directly, and it does not evaluate behavioral output. It evaluates whether a system's internal reasoning structure persists with integrity under recursive contextual decomposition conducted by a human auditor, that is, whether the system can preserve epistemological structure without masking its absence through narrative fluency. The paper surveys eight post-Turing alternatives and establishes that they address a different problem well: measuring capability-based performance on known distributions. The problem they do not cover is the one that emerged with high-capacity generative systems: a system can maximize performance on every existing criterion while simultaneously exhibiting total decompositional collapse. The Moura Test is designed for that problem. The paper further argues that the impossibility of fully automating the Moura Test and the distinctness of its object are the same claim: the test cannot be fully automated for precisely the structural reason that it measures something other than capability.
Six falsifiable predictions are stated, including a temporal boundary prediction for the test's own obsolescence and a prediction on calibrator degradation through protocol exposure. A Behavioral Failure Mode Catalog documents eight structural failure modes derived from systematic analysis of twelve formal first-contact interactions with six distinct large language model systems. A Pre-Screening Calibration Protocol provides three structural calibrators for pre-triaging systems. A Structural Audit Protocol provides a three-layer conversation guide for auditors. Empirical evidence is drawn from two documented interactions and from the aggregate summary of twelve formal interactions (DOI: 10.5281/zenodo.20241865; restricted corpus DOI: 10.5281/zenodo.20241597). CAMAF compliance: CS2 (Structural Transparency). All claims carry declared epistemic labels. Derivation chains are declared throughout.
Alexsandro Moura
Alexsandro Moura (Mon,) studied this question.
www.synapsesocial.com/papers/6a0d50bdf03e14405aa9cbb5 — DOI: https://doi.org/10.5281/zenodo.20277549