What question did this study set out to answer?

The study aims to compare deep reinforcement learning with traditional design methods for optimizing hole-doped Hubbard clusters.

March 25, 2026 · Open Access

Deep reinforcement learning for autonomous control of hole-doped Hubbard clusters: A comparative study

Key Points

Characterization of the half-filled Hubbard model doped with one hole
Comparison of systematic physics-guided design and deep reinforcement learning
Benchmarking across five different 3D lattice geometries
RL achieves human-competitive accuracy with R² > 0.97
95.5% success rate on held-out tasks
RL exhibits 10³–10⁴× greater sample efficiency than grid search
91% few-shot generalization to unseen geometries

Abstract

Engineering electron correlations in quantum dot arrays demands navigation of high-dimensional, non-convex parameter spaces, where hole doping fundamentally alters the physics. We present a rigorous comparative study of two control paradigms for the half-filled Hubbard model doped with one hole: (i) systematic physics-guided design and (ii) autonomous deep reinforcement learning (RL) with geometry-aware neural architectures. While systematic analysis reveals key design principles, such as field-induced localization for trapping the mobile hole, it is computationally intractable for optimization. We demonstrate that an autonomous RL agent, benchmarked across five 3D lattices (tetrahedron to FCC), achieves human-competitive accuracy (R² > 0.97) and 95.5% success on held-out tasks. Critically, the RL agent achieves this performance with 10³–10⁴× greater sample efficiency than grid search and outperforms other black-box optimization methods. Transfer learning demonstrates 91% few-shot generalization to unseen geometries. This work establishes autonomous RL as a viable, highly efficient framework for rapid optimization and non-obvious strategy discovery in complex quantum systems.
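
For orientation, the model named in the abstract can be written in its standard textbook form. This is a sketch under assumptions, not the Hamiltonian as parameterized in the study: the abstract does not state it, so the site-dependent hoppings t_ij, the on-site repulsion U, and the on-site energies ε_i (used here as a stand-in for the "field" behind field-induced localization) are illustrative symbols. The filling constraint expresses "one hole away from half filling" on a cluster of N sites.

```latex
% Standard single-band Hubbard Hamiltonian on a cluster of N sites.
% Assumed notation (not taken from the study): t_{ij}, U, \varepsilon_i.
\begin{align}
  H &= -\sum_{\langle i,j \rangle,\sigma} t_{ij}
        \left( c^{\dagger}_{i\sigma} c_{j\sigma} + \mathrm{h.c.} \right)
      + U \sum_{i} n_{i\uparrow} n_{i\downarrow}
      + \sum_{i,\sigma} \varepsilon_i \, n_{i\sigma},
  \qquad n_{i\sigma} = c^{\dagger}_{i\sigma} c_{i\sigma}, \\
  N_e &= \sum_{i,\sigma} \langle n_{i\sigma} \rangle = N - 1
  \quad \text{(half filling is } N_e = N \text{; one hole removes one electron).}
\end{align}
```

In this notation, the control parameters an optimizer would tune are some subset of t_ij, U, and ε_i; which of these the study actually varies is not specified in the abstract.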
