In this article, we present a novel off-policy, safe reinforcement learning (RL) approach for nonlinear dynamical systems under input saturation that guarantees safe initialization, safe exploration, and safe learning of optimal control laws. First, to encourage exploration near the safety boundaries, which is important for capturing system behavior near the safety limits, we formulate safe exploration as a robust control problem by considering an enlarged safe set based on input-to-state safe control barrier functions (ISSf-CBFs). The resulting constraints are then incorporated into a quadratic programming (QP) optimization. We propose a novel -tuning law that adaptively enforces stricter safety constraints near the boundaries of the safe set and relaxes them deeper within the safe set, encouraging exploration close to the safety boundary while maintaining forward invariance of the safe set. The proposed -tuning law safely accommodates aggressive, high-magnitude exploration noise, enabling efficient state-space exploration without compromising safety. Next, safe learning under saturation limits is guaranteed through a safety-aware cost function. We rigorously establish novel safety, optimality, and stability properties. Furthermore, the safe RL problem is solved in an off-policy manner, with neural networks approximating the value function and the control policy. To that end, we establish a novel off-policy equation under input saturation. Finally, simulations demonstrate the efficacy of the proposed framework.
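To make the idea of a QP-based safety filter under input saturation concrete, the following is a minimal sketch of a generic CBF filter, not the method of the article. Everything in it is an assumption chosen for illustration: the scalar single-integrator dynamics `x' = u`, the safe set `h(x) = 1 - x^2 >= 0`, the fixed gain `ALPHA`, the saturation limit `U_MAX`, and the uniform exploration noise. It uses a plain CBF condition with a constant gain and does not implement the ISSf-CBF enlargement or the adaptive tuning law described in the abstract. Because the input is scalar, the QP `min (u - u_nom)^2` subject to the CBF constraint and the saturation bounds reduces to clipping `u_nom` onto an interval.

```python
import random

ALPHA = 5.0  # assumed constant gain in the CBF condition (illustrative value)
U_MAX = 2.0  # assumed symmetric input saturation limit (illustrative value)


def h(x):
    """Assumed barrier function: the safe set is {x : h(x) >= 0} = [-1, 1]."""
    return 1.0 - x * x


def safe_input(x, u_nom, alpha=ALPHA, u_max=U_MAX):
    """Closest input to u_nom satisfying the CBF constraint and saturation.

    For x' = u the CBF condition dh/dx * u >= -alpha * h(x) is linear in u:
    a * u >= b with a = -2x and b = -alpha * h(x). Assumes x is in the safe set,
    where the constraint and the saturation bounds are jointly feasible.
    """
    a = -2.0 * x
    b = -alpha * h(x)
    lo, hi = -u_max, u_max
    if a > 0.0:
        lo = max(lo, b / a)  # constraint is a lower bound on u
    elif a < 0.0:
        hi = min(hi, b / a)  # constraint is an upper bound on u
    # a == 0 (x == 0): the constraint reads 0 >= -alpha * h(0), which holds.
    return min(max(u_nom, lo), hi)  # scalar QP solution: project onto [lo, hi]


def simulate(steps=2000, dt=0.01, noise=10.0, seed=0):
    """Forward-Euler rollout with a stabilizing nominal input plus large noise."""
    rng = random.Random(seed)
    x = 0.0
    xs, us = [x], []
    for _ in range(steps):
        u_nom = -x + noise * rng.uniform(-1.0, 1.0)  # exploration well above U_MAX
        u = safe_input(x, u_nom)
        x += dt * u
        xs.append(x)
        us.append(u)
    return xs, us


if __name__ == "__main__":
    xs, us = simulate()
    print("max |x| =", max(abs(x) for x in xs))
    print("max |u| =", max(abs(u) for u in us))
```

With these assumed values the filtered state stays inside `[-1, 1]` and the applied input stays within the saturation limit even though the exploration noise is several times larger than `U_MAX`. A fixed `ALPHA` applies the same constraint shape everywhere, whereas the abstract describes an adaptive law that tightens the constraint near the boundary and relaxes it in the interior.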
Mayank Shekhar Jha
Centre National de la Recherche Scientifique
Satya Marthi
Centre National de la Recherche Scientifique
Kyriakos G. Vamvoudakis
Georgia Institute of Technology
IEEE Transactions on Neural Networks and Learning Systems
Centre National de la Recherche Scientifique
Georgia Institute of Technology
Université de Lorraine
synapsesocial.com/papers/6a001ff2c8f74e3340f9b2ed — DOI: https://doi.org/10.1109/tnnls.2026.3688045