What question did this study set out to answer?

The aim is to establish a framework for engineering trust in AI systems by improving the reliability of confidence scores.

May 13, 2026 | Open Access

Verified Autonomy: A Field Guide to Engineering Trust in AI Systems

Key Points

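The paper presents Verified Autonomy, a nine-layer defence-in-depth architecture for engineering trust into production AI systems.
The framework is organised into three tiers: uncertainty reporting, architectural controls, and verification.
Each of the nine layers has an open-source reference implementation that can be run in isolation with a single command.
Motivating evidence: NREL’s baseline system-level solar forecasts, designed to produce 90% prediction intervals, achieve only 75% empirical coverage in held-out evaluation (Moradi et al., 2026).
The Integrity Clash vulnerability (Nemecek et al., 2026) demonstrates that independent verification layers that do not cross-validate each other create exploitable gaps.
The reference implementations total 475 tests at 98 to 100% coverage.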
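
The gap between a nominal 90% prediction interval and its empirical coverage can be checked with a few lines of code. The following is a minimal sketch, not taken from the paper's repository: the function name `empirical_coverage` and the toy data are assumptions made for illustration only.

```python
def empirical_coverage(y_true, lower, upper):
    """Fraction of observations that fall inside their [lower, upper] interval.

    Hypothetical helper for illustration; not part of the paper's code.
    """
    if not (len(y_true) == len(lower) == len(upper)):
        raise ValueError("inputs must have equal length")
    if not y_true:
        raise ValueError("inputs must be non-empty")
    hits = sum(1 for y, lo, hi in zip(y_true, lower, upper) if lo <= y <= hi)
    return hits / len(y_true)


# Toy data, invented for illustration: two of the four points are covered.
y_true = [1.0, 2.0, 3.0, 4.0]
lower = [0.5, 2.5, 2.0, 3.0]
upper = [1.5, 3.5, 4.0, 3.5]

nominal = 0.90  # the coverage level the intervals were designed to reach
coverage = empirical_coverage(y_true, lower, upper)
shortfall = nominal - coverage  # positive when intervals under-cover
```

A positive `shortfall` on held-out data means the intervals are narrower than their stated confidence level justifies, which is the kind of miscalibration the key points describe.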

Abstract

Production AI systems routinely report confidence scores that bear no reliable relationship to their actual probability of being correct. A 110-layer ResNet reports an Expected Calibration Error of 12.75% on CIFAR-100 (Guo et al., 2017). NREL’s baseline system-level solar forecasts, designed to produce 90% prediction intervals, achieve only 75% empirical coverage in held-out evaluation (Moradi et al., 2026). Downstream decisions, including whether to auto-process or escalate, whether to discharge a patient, and how to size a trading position, depend on these scores. This paper presents Verified Autonomy, a nine-layer defence-in-depth architecture for engineering trust into AI systems in production. The framework is organised into three tiers: uncertainty reporting (inverse confidence weighting, outlier detection, visible failures), architectural controls (calibration and conformal prediction, deterministic guardrails, retrieval-augmented generation as explainability), and verification (adversarial testing, cryptographic audit trails, formal verification). The central argument is that trust is not a feature of any single technique. It is an emergent property of a layered architecture where each layer compensates for the failure modes of the others. We ground this argument in the Integrity Clash vulnerability (Nemecek et al., 2026), which demonstrates that independent verification layers that do not cross-validate each other create exploitable gaps. All nine layers are accompanied by open-source reference implementations totalling 475 tests at 98 to 100% coverage, each runnable in isolation with a single command. The architecture has been extended to govern the transfer points between systems, not only the outputs of individual systems, with maritime intelligence as the worked example.
The full code, this preprint, and the practitioner-facing field guide companion are available at https://github.com/antnewman/verified-autonomy under CC BY 4.0 (content) and MIT (code).
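
For readers unfamiliar with the Expected Calibration Error cited in the abstract, the sketch below shows the commonly used binned estimate: predictions are grouped by confidence, and the absolute gap between accuracy and mean confidence in each bin is averaged, weighted by bin size. This is an illustrative sketch only, not the paper's reference implementation. The function name, the choice of ten equal-width bins, the left-closed bin boundaries, and the toy data are all assumptions.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE estimate: sum over bins of (n_b / N) * |accuracy_b - confidence_b|.

    Hypothetical helper for illustration. Bins are equal-width and left-closed,
    with a confidence of exactly 1.0 placed in the last bin (an assumed convention).
    """
    n = len(confidences)
    if n == 0 or n != len(correct):
        raise ValueError("inputs must be non-empty and of equal length")

    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece


# Toy data, invented for illustration: the model claims 95% confidence
# on four predictions but only two of them are correct.
confidences = [0.95, 0.95, 0.95, 0.95]
correct = [True, True, False, False]
ece = expected_calibration_error(confidences, correct)
```

On this toy input the model is overconfident by roughly 45 percentage points, so the estimate is large. A perfectly calibrated model would score zero.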

Authors

Ant Newman

Rodrigues Economic Chamber and Industry

Shanti Greene

Central Connecticut State University

Malia Hosseini
