What type of study is this?

This is an experimental study.

synapse

September 30, 2025 · Open Access

Off Policy Lyapunov Stability in Reinforcement Learning

Key Points

Learning Lyapunov functions off-policy gives reinforcement learning algorithms stability guarantees without the sample inefficiency of on-policy, self-learned Lyapunov functions.
Incorporating the off-policy Lyapunov function into Soft Actor-Critic (SAC) and Proximal Policy Optimization (PPO) provides both algorithms with a data-efficient stability certificate.
Simulations of an inverted pendulum and a quadrotor show improved performance for both algorithms when they use the proposed Lyapunov function.
The results point toward reinforcement learning methods that combine stability certificates with efficient use of data.

Abstract

Traditional reinforcement learning cannot provide stability guarantees. More recent algorithms learn Lyapunov functions alongside the control policies to ensure stable learning. However, current self-learned Lyapunov functions are sample-inefficient because they are learned on-policy. This paper introduces a method for learning Lyapunov functions off-policy and incorporates the proposed off-policy Lyapunov function into the Soft Actor-Critic and Proximal Policy Optimization algorithms to provide them with a data-efficient stability certificate. Simulations of an inverted pendulum and a quadrotor illustrate the improved performance of the two algorithms when endowed with the proposed off-policy Lyapunov function.
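
To make the idea of a sampled stability certificate concrete, the sketch below checks a standard discrete-time Lyapunov decrease condition, V(s') - V(s) <= -alpha * V(s), on a batch of stored transitions. It is an illustration under stated assumptions, not the paper's algorithm: the function names, the quadratic candidate V(s) = s^T P s, the linear toy dynamics, and the value of alpha are all hypothetical choices made for this example. How to correct for the mismatch between the policy that generated the stored data and the current policy is not described in this summary and is not attempted here.

```python
import numpy as np

def quadratic_candidate(P):
    """Return V(s) = s^T P s evaluated row-wise on a batch of states (assumed form)."""
    return lambda s: np.einsum("bi,ij,bj->b", s, P, s)

def lyapunov_violation(V, states, next_states, alpha=0.1):
    """Mean positive part of V(s') - V(s) + alpha * V(s) over a batch.

    A value of zero means every sampled transition satisfies the
    decrease condition V(s') <= (1 - alpha) * V(s).
    """
    v, v_next = V(states), V(next_states)
    return float(np.mean(np.maximum(v_next - v + alpha * v, 0.0)))

# Hypothetical batch of states, standing in for samples from a replay buffer.
rng = np.random.default_rng(0)
states = rng.normal(size=(256, 2))

# Toy closed-loop dynamics s' = A s (assumed for illustration only).
A_stable = np.diag([0.9, 0.8])     # contracts every state
A_unstable = np.diag([1.1, 1.05])  # expands every state

V = quadratic_candidate(np.eye(2))
stable_violation = lyapunov_violation(V, states, states @ A_stable.T)
unstable_violation = lyapunov_violation(V, states, states @ A_unstable.T)
```

In a learning setting, a penalty of this shape could be minimized over the parameters of V and of the policy; here it only reports whether the sampled transitions are consistent with the candidate function.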
