An online off-policy asynchronous real-time model reference tracking control (OOART-MRTC) algorithm is proposed and validated for unmanned aerial vehicles (UAVs) subject to faulty actuation and parametric uncertainty. The optimal control problem is formulated using approximate dynamic programming (ADP) and reinforcement learning (RL) theory, on a virtual state-space representation that is constructed exclusively from input–output data of the true system by exploiting observability theory. OOART-MRTC learns the control law by interacting with the system, starting from an initial stabilizing controller derived from an approximate, uncertain model. Learning convergence and stability under the proposed adaptive behavior are analyzed. Because a learning iteration cannot complete within a single sampling period, an asynchronous mechanism is proposed for updating the controller parameters, leveraging real-time control and multi-tasking. The complexity of the resulting high-dimensional system is handled through an efficient linear parameterization, and the approach is validated on a realistic case study in which three coupled double integrators describe the UAV attitude control.
Mircea-Bogdan Radac studied this question.
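The idea of a virtual state built only from input–output data can be illustrated on a toy example. The sketch below is not the OOART-MRTC algorithm itself. It assumes a single, uncoupled discrete-time double integrator with position as the only measured output, a sampling period `ts = 0.1`, and a virtual state `[y_k, y_{k-1}, u_{k-1}]`. All function and variable names are hypothetical. Because this toy system is observable, the true state should be an exact linear function of the virtual state, which the least-squares fit checks numerically.

```python
import numpy as np


def simulate_double_integrator(u, ts=0.1, x0=(0.0, 0.0)):
    """Simulate x_{k+1} = A x_k + B u_k, y_k = position (assumed toy model)."""
    A = np.array([[1.0, ts], [0.0, 1.0]])
    B = np.array([0.5 * ts**2, ts])
    x = np.array(x0, dtype=float)
    states, outputs = [], []
    for uk in u:
        states.append(x.copy())   # true state [position, velocity]
        outputs.append(x[0])      # only the position is measured
        x = A @ x + B * uk
    return np.array(states), np.array(outputs)


def virtual_states(y, u):
    """Stack v_k = [y_k, y_{k-1}, u_{k-1}] for k >= 1 (hypothetical choice)."""
    return np.column_stack([y[1:], y[:-1], u[:-1]])


ts = 0.1
rng = np.random.default_rng(0)
u = rng.normal(size=200)                      # exciting input signal
x, y = simulate_double_integrator(u, ts=ts, x0=(1.0, -0.5))
v = virtual_states(y, u)

# Fit x_k ~ v_k @ M; an exact fit means the virtual state carries the
# same information as the unmeasured true state.
M, *_ = np.linalg.lstsq(v, x[1:], rcond=None)
max_err = np.max(np.abs(v @ M - x[1:]))

# Closed-form map for this toy model, derived from the update equations:
# position_k = y_k, velocity_k = (y_k - y_{k-1}) / ts + (ts / 2) * u_{k-1}
M_expected = np.array([[1.0, 0.0, 0.0], [1.0 / ts, -1.0 / ts, ts / 2.0]])
```

In this toy setting a controller or value function expressed in terms of `v` would need no model of `A` and `B`, only recorded inputs and outputs. The case study in the abstract involves three coupled double integrators, so its virtual state would be correspondingly larger.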