Lyapunov design methods are used widely in control engineering to design controllers that achieve qualitative objectives, such as stabilizing a system or maintaining a system's state in a desired operating range. We propose a method for constructing safe, reliable reinforcement learning agents based on Lyapunov design principles. In our approach, an agent learns to control a system by switching among a number of given, base-level controllers. These controllers are designed using Lyapunov domain knowledge so that any switching policy is safe and enjoys basic performance guarantees. Our approach thus ensures qualitatively satisfactory agent behavior for virtually any reinforcement learning algorithm and at all times, including while the agent is learning and taking exploratory actions. We demonstrate the process of designing safe agents for four different control problems. In simulation experiments, we find that our theoretically motivated designs also enjoy a number of practical benefits, including reasonable performance initially and throughout learning, and accelerated learning.
Keywords: Reinforcement Learning, Lyapunov Functions, Safety, Stability
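The core idea, that every base-level controller decreases a shared Lyapunov function so that any switching policy is safe, can be illustrated with a minimal sketch. The scalar system, the function V(x) = x^2, the two proportional gains, and all names below are assumptions chosen for illustration; they are not the control problems or controllers studied in the paper. A random choice stands in for an exploring reinforcement learning agent.

```python
import random

def lyapunov(x):
    # Hypothetical Lyapunov function V(x) = x^2 for the toy system.
    return x * x

def make_controller(gain):
    # Proportional controller u = -gain * x. For 0 < gain < 2 the
    # closed loop x' = (1 - gain) * x strictly decreases V when x != 0.
    def controller(x):
        return -gain * x
    return controller

# Base-level controllers (illustrative gains, not from the paper).
CONTROLLERS = [make_controller(0.5), make_controller(0.9)]

def step(x, u):
    # Toy discrete-time dynamics x_{t+1} = x_t + u_t (an assumption).
    return x + u

def run_random_switching(x0, n_steps, seed=0):
    # A random switching policy stands in for exploratory actions.
    rng = random.Random(seed)
    x = x0
    values = [lyapunov(x)]
    for _ in range(n_steps):
        controller = rng.choice(CONTROLLERS)
        x = step(x, controller(x))
        values.append(lyapunov(x))
    return values

values = run_random_switching(x0=4.0, n_steps=20)
# V decreases at every step regardless of which controller was chosen.
print(all(b < a for a, b in zip(values, values[1:])))
```

Because each controller individually decreases V, the learner is free to explore among them; learning then only affects how quickly the state converges, not whether it does.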
Perkins et al. studied this question.