A reinforcement learning (RL) approach is proposed for PID controller fine-tuning and full parameter estimation, aimed at accurate tracking of a helix trajectory under realistic flight controller sampling times. The RL agent uses the Deep Deterministic Policy Gradient (DDPG) algorithm, an off-policy actor-critic method. The quadrotor model follows the Newton-Euler formulation and accounts for complete gyroscopic and drag effects. Training and simulation studies are performed in MATLAB/Simulink. Three approaches are evaluated and compared: hand-tuned gains, RL-based fine-tuning, and RL-based full estimation of the controller parameters. Results show that full estimation of the controller parameters achieves the smallest attitude and position errors, and that both RL-based strategies significantly improve tracking performance compared to the hand-tuned approach.
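To make the tuning problem concrete, the following Python sketch tracks a helix reference with one discrete PID loop per axis at a fixed sampling time and returns an RMS position error. It is an illustration under strong assumptions, not the paper's setup: each axis is modeled as a unit-mass double integrator rather than the Newton-Euler quadrotor with gyroscopic and drag effects, the helix parameters and gain values are hypothetical, and no DDPG agent is implemented. The RMS error is only one plausible scalar that a tuner (hand tuning or an RL agent) might try to reduce.

```python
import math
from dataclasses import dataclass


@dataclass
class PID:
    """Discrete PID controller evaluated once per sampling period dt."""
    kp: float
    ki: float
    kd: float
    dt: float
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, error: float) -> float:
        # Rectangular integration and backward-difference derivative.
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def helix_reference(t, radius=1.0, omega=0.5, climb_rate=0.2):
    """Hypothetical helix: circle in the x-y plane with a constant climb in z."""
    return (radius * math.cos(omega * t),
            radius * math.sin(omega * t),
            climb_rate * t)


def simulate(gains, dt=0.01, duration=20.0):
    """Return the RMS position error for one (kp, ki, kd) triple.

    Assumption: each axis is a unit-mass double integrator driven directly
    by the PID output. This stands in for the full quadrotor dynamics.
    """
    kp, ki, kd = gains
    pids = [PID(kp, ki, kd, dt) for _ in range(3)]
    pos = list(helix_reference(0.0))  # start on the reference, at rest
    vel = [0.0, 0.0, 0.0]
    sq_err = 0.0
    steps = round(duration / dt)
    for k in range(steps):
        ref = helix_reference(k * dt)
        for i in range(3):
            err = ref[i] - pos[i]
            acc = pids[i].step(err)
            vel[i] += acc * dt       # semi-implicit Euler integration
            pos[i] += vel[i] * dt
            sq_err += err * err
    return math.sqrt(sq_err / (3 * steps))


if __name__ == "__main__":
    loose = simulate((1.0, 0.0, 0.5))   # hypothetical poorly tuned gains
    tight = simulate((8.0, 0.5, 4.0))   # hypothetical better tuned gains
    print(f"RMS error, loose gains: {loose:.4f}")
    print(f"RMS error, tight gains: {tight:.4f}")
```

Changing `dt` in `simulate` gives a rough feel for how the sampling time interacts with a fixed set of gains, which is the kind of sensitivity the abstract refers to when it mentions realistic flight controller sampling times.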
Sönmez et al. studied this question.