Structural Honesty Verification (SHV) is proposed as an under-theorized category of verification that addresses the consistency between what artificial systems claim at their surfaces and what their substrates actually deliver. Software supply-chain security has matured substantially through systems such as Sigstore, SLSA, and in-toto. This paper argues that a complementary verification problem has not been articulated as a distinct verification category: whether a system's surface claims remain consistent with its substrate behavior across versions, contexts, and deployment conditions. SHV is proposed as substrate-independent, and as composing with formal verification, static analysis, and provenance attestation rather than replacing any of them. The paper presents three principal realizations of SHV at distinct maturity levels. The first is the Munafiq Protocol (theoretical, with retrospective empirical engagement), a diagnostic framework for performed alignment in AI systems, grounded in a four-process taxonomy and in the structural correspondence between performed-alignment detection and the generative adversarial discrimination problem. The second is the Furqan programming language (theoretical), which proposes seven compile-time primitives that elevate structural honesty from convention to compiler-enforced default. The third is furqan-lint (operational, v0.11.6 release snapshot), a static-analysis tool that ships with five checks across three language substrates (Python, Rust, Go) plus a parallel ONNX diagnostic family, and that recursively verifies its own releases against the same checks it imposes. Companion realizations are sketched at the file, information, agent, large-language-model architecture, financial-governance, and methodology layers.
The SHV diagnostic vocabulary is mapped retrospectively to five publicly documented safety-critical failures (Therac-25, Boeing 737 MAX MCAS, Northeast Blackout 2003, Wirecard / EY, and Toyota Unintended Acceleration) and to three documented software supply-chain attacks (SolarWinds Sunburst, Log4Shell, and the XZ Utils backdoor). These mappings are vocabulary correspondences, not counterfactual prevention claims. Implications for AI safety, safety-critical software engineering, regulatory compliance, and the verification of large-language-model-generated code are discussed. The work is theoretical with one operational anchor (furqan-lint v0.11.6); empirical validation across instances and substrates is named as the principal load-bearing future work. This is Version 1 of the paper, dated May 11, 2026. The release-snapshot scope (front matter) governs the relationship between this paper and later releases of the operational anchor. AI assistance disclosure: this work was developed with collaborative analytical assistance from Claude (Anthropic), Grok (xAI), and Perplexity Computer; all claims and conclusions are the responsibility of the human authors.
Arfeen et al. studied this question.