May 9, 2026 · Open Access

Gold-Standard AGI, Part One: Outer AGI Superalignment

Key Points

The aim is to develop a comprehensive theory for outer AGI alignment, particularly for superintelligent agents.
Proposes a foundational theory for AGI development that emphasizes outer alignment and validation.
Defines key concepts such as practical-maximal-alignment and practical-maximal-validation.
Adopts a pedagogic style to ensure accessibility for policymakers.
Proposes an implementation-neutral solution to the outer alignment problem for superintelligent agents.
Suggests that the defined concepts could establish a certification standard for AGI.
Highlights the relevance of the AGI alignment problem to AGI governance.

Abstract

The way in which AI (and, in particular, agentic superintelligent AGI) develops over the coming decades will determine the fate of all humanity for all eternity. In order to maximise the net benefit of AGI for all humanity, without favouring any subset thereof, we imagine a Gold-Standard AGI that is maximally-aligned and maximally-validated. The first of these properties, alignment, is traditionally decomposed into outer alignment (how do we define a final goal FG_G that correctly states what we want?) and inner alignment (how do we build an agent G that forever pursues FG_G as intended?). This paper presents a complete and foundational theory of AGI, culminating in a proposed implementation-neutral solution to the outer AGI alignment problem in the case that G is superintelligent (hence "superalignment"). Given the AGI alignment problem's profound relevance to AGI governance, we adopt a pedagogic style throughout so that the paper is accessible to less technical readers such as AGI policymakers. We envisage that the definitions of practical-maximal-alignment and practical-maximal-validation presented in this paper could form the basis of an international standard for Gold-Standard AGI certification. That standard could in turn underpin the formal certification of AGI by competent certification authorities, such that only formally-certified Gold-Standard AGI systems could be lawfully deployed within the jurisdiction of each certification authority.
