This paper presents a physics-grounded alignment framework for finite adaptive systems, derived from constraint structure rather than from values, policy, or reward design. Its central claim is that a system remains aligned only while it cannot become the final certifier of its own correction. From that premise, the paper develops a corresponding architecture, constitution, and stability framework.
Taylor Prather studied this question.