What question did this study set out to answer?

The study aims to develop a control algorithm for unknown linear continuous-time systems that relies only on measured input-output data, since the system states are only partially observable.

January 26, 2026

Output-Feedback Control of Linear Continuous-Time Systems Using Discounted Inverse Reinforcement Learning

Key Points

Developed a state reconstruction method based on expert control data.
Introduced a model-free output-feedback DIRL algorithm to solve for unknown value functions.
Analyzed the convergence of the proposed algorithm and the nonuniqueness of its solutions.
Showed in simulation that the algorithm recovers the expert control policy.
Demonstrated, in simulation, better computational efficiency than state-of-the-art methods.

Abstract

This article proposes a novel discounted inverse reinforcement learning (DIRL) algorithm for linear quadratic (LQ) control of unknown continuous-time (CT) systems with partially observable states and an unknown discounted value function. Existing DIRL methods predominantly rely on full-state feedback, which limits their applicability to practical scenarios where only input-output data are available. To address this limitation, a state reconstruction method is designed for the system controlled by an expert using the measured desired output. Based on this, a model-free output-feedback (OPFB) DIRL algorithm is presented to iteratively solve for the unknown value function and the corresponding optimal OPFB control policy, which is equivalent to the expert control policy. The convergence of the proposed algorithm and the nonuniqueness of its solutions are rigorously analyzed. Finally, comprehensive simulations demonstrate the effectiveness of the proposed algorithm in recovering the expert control policy and its superior computational efficiency compared to state-of-the-art (SOTA) methods.
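As background for the "discounted value function" mentioned in the abstract, the following is a minimal sketch of the forward (non-inverse) discounted LQ problem for a scalar system. It is not the paper's algorithm, which is model-free and works from output data. The cost form, the function names, and the numeric values are all assumptions chosen for illustration: the system is dx/dt = a*x + b*u, the cost is the integral of exp(-gamma*t) * (q*x^2 + r*u^2), and the policy is u = -k*x. Under these assumptions the value function is p*x^2, where p solves the discounted Riccati equation 2*a*p - gamma*p + q - (b^2/r)*p^2 = 0.

```python
import math


def discounted_lqr_scalar(a, b, q, r, gamma):
    """Hypothetical helper: closed-form discounted LQ solution for a scalar system.

    Assumed model:  dx/dt = a*x + b*u
    Assumed cost:   integral of exp(-gamma*t) * (q*x**2 + r*u**2) dt
    Returns (p, k), where p*x**2 is the value function and u = -k*x.
    """
    # Discounting is equivalent to shifting the drift term by gamma/2.
    a_shift = a - gamma / 2.0
    # Positive root of 2*a_shift*p + q - (b**2 / r) * p**2 = 0.
    p = (r / b**2) * (a_shift + math.sqrt(a_shift**2 + q * b**2 / r))
    k = b * p / r
    return p, k


def discounted_are_residual(a, b, q, r, gamma, p):
    """Residual of the scalar discounted Riccati equation at p."""
    return 2.0 * a * p - gamma * p + q - (b**2 / r) * p**2


# Illustrative values only (assumptions, not taken from the paper).
a, b, q, r, gamma = 1.0, 1.0, 1.0, 1.0, 0.5
p, k = discounted_lqr_scalar(a, b, q, r, gamma)
residual = discounted_are_residual(a, b, q, r, gamma, p)
print(p, k, residual)
```

In an inverse setting the roles are reversed: the gain k is observed from the expert, and the cost weights are what must be recovered. Because scaling q and r by the same positive factor leaves k unchanged in this sketch, more than one cost can explain the same expert policy, which is one simple way to see why nonuniqueness of solutions arises.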
