What question did this study set out to answer?

The main aim is to solve the fuzzy algebraic Riccati equation for nonlinear systems with unknown dynamics.

February 28, 2026

Reinforcement Learning-Based Fuzzy Control for Nonlinear Systems With Unknown Dynamics via Parallel Composite Policy Iteration Scheme

Key points

The main aim is to solve the fuzzy algebraic Riccati equation for nonlinear systems with unknown dynamics.
Developed a parallel composite policy iteration algorithm for fuzzy control.
Introduced an adaptive parameter to remove the initial stabilizing control policy requirement.
Proposed an online model-free algorithm to handle difficulties in dynamic information acquisition.
Implemented algorithms on single-link robot arm and quarter-car active suspension systems.
Successfully verified the effectiveness of the proposed algorithms through experiments.
Demonstrated improved performance in managing nonlinear systems compared to traditional methods.

Summary

This article studies reinforcement learning (RL)-based fuzzy control for nonlinear systems with unknown dynamics via a parallel composite policy iteration (PCPI) scheme. The main objective is to solve the fuzzy algebraic Riccati equation (FARE), which is inherently complex and cannot easily be solved in closed form. Policy iteration (PI) and value iteration (VI) algorithms have been widely used to address this problem. However, these algorithms require an initial stabilizing control policy, the persistent excitation (PE) condition, and large amounts of data. To alleviate these drawbacks, a novel PCPI algorithm is proposed. Specifically, for each fuzzy subsystem, an adaptive parameter is designed to eliminate the requirement of an initial stabilizing control policy. In addition, an online model-free PCPI algorithm is proposed for the situation where the dynamic information of the fuzzy system is difficult to obtain. By substituting the stored historical data with online data, the PE condition is relaxed to the initial excitation (IE) condition. Moreover, the corresponding algorithm can be executed independently and in parallel under each fuzzy rule, thereby fully exploiting the available computational resources. Finally, the effectiveness of the proposed algorithms is verified through single-link robot arm and quarter-car active suspension (QCAS) experiments.
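To make the "initial stabilizing control policy" requirement concrete, here is a minimal sketch of classical model-based policy iteration on a scalar linear-quadratic problem. It is not the PCPI algorithm from the article: the system x' = a x + b u, the cost weights q and r, and the function names are all illustrative assumptions, and the sketch uses known dynamics and no fuzzy rules. It only shows why standard PI fails when the starting gain does not stabilize the system.

```python
import math

def policy_iteration_scalar(a, b, q, r, k0, iters=20):
    """Classical policy iteration for the scalar ARE 2*a*p - (b*p)**2/r + q = 0.

    Assumed (hypothetical) setup: dynamics x' = a*x + b*u, policy u = -k*x,
    cost integrand q*x**2 + r*u**2. Requires a stabilizing initial gain k0.
    """
    k = k0
    for _ in range(iters):
        a_cl = a - b * k  # closed-loop pole under the current policy
        if a_cl >= 0:
            # Policy evaluation has no finite cost for an unstable closed loop.
            raise ValueError("policy k does not stabilize the system")
        # Policy evaluation: solve 2*a_cl*p + q + r*k**2 = 0 for p.
        p = -(q + r * k * k) / (2.0 * a_cl)
        # Policy improvement: k = b*p/r.
        k = b * p / r
    return p, k

def are_residual(a, b, q, r, p):
    """Left-hand side of the scalar ARE; zero at the exact solution."""
    return 2.0 * a * p - (b * p) ** 2 / r + q

# Open-loop unstable example (a > 0) with a stabilizing initial gain k0 = 2.
p_star, k_star = policy_iteration_scalar(a=1.0, b=1.0, q=1.0, r=1.0, k0=2.0)
```

For these values the ARE reduces to p^2 - 2p - 1 = 0, whose positive root is 1 + sqrt(2), so both `p_star` and `k_star` should approach that value. Calling the same function with `k0=0.5` raises `ValueError`, because the closed loop a - b*k0 is then unstable. That dependence on a good starting gain is the limitation that the adaptive parameter described above is meant to remove.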

Cite This Study

Liu et al. studied this question.

synapsesocial.com/papers/69a285da0a974eb0d3c00c17 https://doi.org/10.1109/tcyb.2026.3665244
