An earlier paper, Gold-Standard AGI: Outer AGI Superalignment, introduced the concept of Gold-Standard AGI: AGI (Artificial General Intelligence) that is both maximally aligned and maximally validated. The first of these properties, alignment, is traditionally decomposed into outer alignment (how do we define a final goal FGG that correctly states what we want?) and inner alignment (how do we build an agent G that forever pursues FGG as intended?). The earlier paper presented a complete, foundational, and self-contained theory of AGI, culminating in an implementation-neutral solution to the outer AGI alignment problem in the case that G is superintelligent (hence "superalignment"). The present paper builds on that work with a solution to the inner AGI alignment problem for superintelligent G, including a novel cognitive architecture for superintelligent AGI and a corresponding construction sequence. As in the earlier paper, we adopt a pedagogic style throughout so that the paper is accessible to less technical readers such as AGI policymakers.
Aaron Turner (Sun,) studied this question.