Formal synthesis of safe controllers is essential for safety-critical cyber-physical systems. In this paper, we propose a novel counterexample-guided approach for synthesizing safe controllers of nonlinear systems using reinforcement learning enhanced with Bayesian optimization, to improve the efficiency of the training process while ensuring the safety property. First, we use the control barrier function technique to establish a constrained Markov decision process, which enables us to learn an initial controller with minimal safety violations. We then design a counterexample-guided policy refinement using Bayesian optimization, to fine-tune the initial controller based on its failure trajectories. Finally, we propose a compensatory mechanism that corrects the tuned controller to guarantee the safety property. We implement the approach in the CEGRLPR tool and evaluate its performance on a set of benchmarks. The experimental results demonstrate the effectiveness and efficiency of our approach.
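To make the idea of a control barrier function correcting an unsafe controller concrete, the following is a minimal sketch on a toy one-dimensional system. It is not the CEGRLPR implementation and not the paper's compensatory mechanism. The dynamics `x' = u`, the barrier `h(x) = x_max - x`, the gain `alpha`, and the names `cbf_filter`, `simulate`, and `aggressive_controller` are all assumptions chosen for illustration.

```python
def cbf_filter(x, u_nominal, x_max=1.0, alpha=2.0):
    """Clip a nominal control so the barrier h(x) = x_max - x stays non-negative.

    For the assumed dynamics x' = u, the barrier condition dh/dt >= -alpha * h
    reduces to u <= alpha * (x_max - x).
    """
    h = x_max - x
    u_bound = alpha * h
    return min(u_nominal, u_bound)


def simulate(controller, x0=0.0, dt=0.1, steps=200, use_filter=True):
    """Roll out the toy system with forward-Euler steps and return the visited states."""
    x = x0
    trajectory = [x]
    for _ in range(steps):
        u = controller(x)
        if use_filter:
            u = cbf_filter(x, u)
        x = x + dt * u
        trajectory.append(x)
    return trajectory


def aggressive_controller(x):
    """Hypothetical learned policy that always pushes toward the unsafe region."""
    return 5.0


unsafe = simulate(aggressive_controller, use_filter=False)
safe = simulate(aggressive_controller, use_filter=True)
print("violates without filter:", max(unsafe) > 1.0)
print("stays safe with filter:", max(safe) <= 1.0)
```

In this sketch the unfiltered rollout plays the role of a failure trajectory (a counterexample), while the filtered rollout shows a correction step keeping the state inside the safe set `x <= x_max`. With the chosen step size, `alpha * dt < 1`, so the barrier value shrinks geometrically and never becomes negative.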
Jin et al. studied this question.