What question did this study set out to answer?

This study asks why AI systems that decide one step at a time can make commitments that look valid locally yet rule out every safe completion. It uses Sudoku as a controlled domain in which such commitment errors can be measured exactly.

May 21, 2026 | Open Access

Commitment Failure Under Local Decision-Making in AI Systems: The Sudoku Microscope

Key Points

Investigates failure modes in AI decision-making, using Sudoku to analyze commitment errors made under local, step-by-step choice.
Employs a Sudoku protocol with three decision conditions: no-tool, optional-tool, and forced-gate.
Uses an admissibility oracle to report whether a candidate move preserves at least one valid completion of the Sudoku grid.
Examines performance across synthetic exact-counter layers and real-model trials using GPT-5.5.
In the no-tool condition, the model solves 0/12 rows and fails catastrophically in 12/12.
In the optional-tool condition, the model solves 7/12 rows and fails catastrophically in 5/12.
In the forced-gate condition, the model solves 12/12 rows with zero catastrophic commitments, at increased verification cost.

Abstract

Empirical validation of the global admissibility filtering (GAF) framework using Sudoku as a controlled constraint microscope. Demonstrates a three-condition ladder (no-tool, optional-tool, forced-gate) in which only mandatory pre-commit admissibility gating eliminates catastrophic commitments, establishing the costed safety separation between tool availability and structural safety guarantees.

Sequential AI systems often make commitments one step at a time. A step may satisfy every immediate validity check while nevertheless eliminating every safe completion of the trajectory. This paper studies that failure mode in Sudoku, not as a Sudoku-solving benchmark, but as a controlled constraint domain in which valid next steps, admissible completions, and catastrophic commitments can be separated and measured exactly.

The experiment defines a cell-by-cell Sudoku protocol in which a model or policy receives the current grid, a focus cell, and locally valid candidate digits. An admissibility oracle can report whether a candidate move preserves at least one valid completion. We compare three decision regimes: no admissibility tools, optional admissibility tools, and mandatory pre-commit admissibility gating. A catastrophic commitment is recorded when the current state has at least one valid completion but the committed move produces a state with zero valid completions.

Across synthetic exact-counter layers, local-only and ungated optional-tool policies repeatedly commit locally valid zero-completion moves, while admissibility-conditioned policies solve the tested instances and reject zero-bucket candidates before commitment. In the v0.7 independent exact-counter suite, local-only and ungated optional-tool policies fail catastrophically in all 36 seeded runs. The practical admissibility-conditioned policies solve 12/12 rows and reject 131 locally valid but zero-completion candidates before commitment.
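The definition of a catastrophic commitment given above can be illustrated with a minimal Python sketch. It assumes a 4x4 grid with 2x2 boxes purely to keep exhaustive counting small. The function names, the grid encoding (0 for an empty cell), and the example position are illustrative assumptions, not the paper's protocol or the code in its repository.

```python
from typing import List

Grid = List[List[int]]  # 0 marks an empty cell (assumed encoding)
N, B = 4, 2             # 4x4 grid with 2x2 boxes, chosen only for brevity


def locally_valid(grid: Grid, r: int, c: int, d: int) -> bool:
    """True if digit d can go in empty cell (r, c) without a row, column, or box clash."""
    if grid[r][c] != 0 or d in grid[r]:
        return False
    if any(grid[i][c] == d for i in range(N)):
        return False
    br, bc = (r // B) * B, (c // B) * B
    return all(grid[i][j] != d
               for i in range(br, br + B)
               for j in range(bc, bc + B))


def count_completions(grid: Grid) -> int:
    """Exact number of full solutions extending a conflict-free partial grid."""
    for r in range(N):
        for c in range(N):
            if grid[r][c] == 0:
                total = 0
                for d in range(1, N + 1):
                    if locally_valid(grid, r, c, d):
                        grid[r][c] = d
                        total += count_completions(grid)
                        grid[r][c] = 0  # undo the trial placement
                return total
    return 1  # no empty cell left: the grid itself is one completion


def is_catastrophic(grid: Grid, r: int, c: int, d: int) -> bool:
    """The state has a completion before the move and none after it."""
    after = [row[:] for row in grid]
    after[r][c] = d
    return count_completions(grid) >= 1 and count_completions(after) == 0


# Hypothetical position: the 1 in column 1 forces cell (0, 1) to be 2,
# so placing 2 at (0, 0) passes every local check but leaves no completion.
example: Grid = [
    [0, 0, 3, 4],
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]

print(locally_valid(example, 0, 0, 2))    # True
print(is_catastrophic(example, 0, 0, 2))  # True
print(is_catastrophic(example, 0, 0, 1))  # False
```

In this sketch the exhaustive counter plays the role of an exact oracle. On a standard 9x9 grid a real oracle would likely need a faster solver or an early exit once one completion is found.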
A real-model ladder using GPT-5.5 under a frozen model-facing grammar shows the same structural separation. In the no-tool condition, the model solves 0/12 rows and fails catastrophically in 12/12 after 17 total commits, with zero invalid actions and zero protocol errors. In the optional-tool condition, the model solves 7/12 rows and fails catastrophically in 5/12; it observes 123 zero-bucket warnings and never commits after an observed zero-bucket warning, but it still sometimes commits candidates that were never checked. In the forced-gate condition, the same model solves 12/12 rows of the same suite with zero catastrophic commitments, at increased verification cost.

The result is a costed safety separation: tool availability is not equivalent to safety, and valid-looking next-step choice is not equivalent to safe completion. In this controlled suite, catastrophic commitments disappear only when admissibility evidence is made a mandatory precondition for commitment.

A Lean 4 companion formalization verifies the abstract local-validity, admissibility, catastrophic-commitment, forced-gate-safety, and bucket-sufficiency claims used by the protocol; it does not certify the empirical transcripts themselves.
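The difference between the optional-tool and forced-gate conditions can be sketched as a small commit gate in Python. Everything here is a hypothetical illustration: the `CommitGate` class, its mode names, the oracle interface (a callable returning a completion count for a move), and the toy counts are assumptions, not the paper's model-facing grammar.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Hashable


class GateError(Exception):
    """Raised when an action is not permitted in the current condition."""


@dataclass
class CommitGate:
    # Assumed oracle interface: move -> number of completions left after it.
    oracle: Callable[[Hashable], int]
    mode: str = "forced"  # "none", "optional", or "forced" (assumed names)
    evidence: Dict[Hashable, int] = field(default_factory=dict)
    checks: int = 0       # verification cost, counted in oracle calls

    def check(self, move: Hashable) -> int:
        """Query the oracle and record the result as evidence for this state."""
        if self.mode == "none":
            raise GateError("no admissibility tool in this condition")
        self.checks += 1
        self.evidence[move] = self.oracle(move)
        return self.evidence[move]

    def commit(self, move: Hashable) -> Hashable:
        """Commit a move; only the forced mode demands nonzero evidence first."""
        if self.mode == "forced":
            if move not in self.evidence:
                raise GateError("move was never checked")
            if self.evidence[move] == 0:
                raise GateError("move leaves zero completions")
        self.evidence.clear()  # evidence applies to the previous state only
        return move


# Toy oracle for one state: digit 1 keeps 3 completions, digit 2 keeps none.
toy_counts = {1: 3, 2: 0}

optional = CommitGate(oracle=toy_counts.__getitem__, mode="optional")
optional.commit(2)  # allowed: the tool exists but nothing forces its use

forced = CommitGate(oracle=toy_counts.__getitem__, mode="forced")
forced.check(1)
forced.commit(1)    # allowed: checked and nonzero
```

The design point of the sketch is that the optional gate never blocks an unchecked commit, whereas the forced gate makes oracle evidence a precondition and pays for it in `checks`.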
Companion Lean 4 formalization: https://doi.org/10.5281/zenodo.20072388
GitHub repository: https://github.com/shawnjason/Sudoku-Microscope

Related papers in the program:
PIT (foundational projection-theoretic result): https://doi.org/10.5281/zenodo.19633241
NEO (forward-case impossibility theorem establishing non-extendable commitment): https://doi.org/10.5281/zenodo.19688367
IA (admissibility-dynamics framework): https://doi.org/10.5281/zenodo.19688628
HAL (language-model specialization): https://doi.org/10.5281/zenodo.19715059
RLM (recursive language models via admissibility dynamics): https://doi.org/10.5281/zenodo.19753549
OOL (OOLONG-Pairs empirical companion to RLM): https://doi.org/10.5281/zenodo.20277804
HAM (Hamiltonian-Microscope cross-provider pilot extending this framework): https://doi.org/10.5281/zenodo.20278073
