Unmanned aerial vehicle (UAV) tracking of moving ground targets has important applications in intelligent transportation, logistics distribution, and environmental monitoring, and it places high demands on efficient, stable path planning for vehicle tracking. This study investigates a UAV path-tracking approach based on Proximal Policy Optimization (PPO), a deep reinforcement learning algorithm. Starting from the kinematic characteristics of the UAV and the ground vehicle, a 3D path-planning model is constructed that accounts for spatial coordinates, velocity, and attitude constraints. The objective function combines tracking-error minimization, energy optimization, and safety-distance constraints. With the state space, action space, and reward function designed accordingly, the PPO agent learns adaptively in complex environments. Compared with the traditional Artificial Potential Field (APF) method and the Q-learning and TD3 algorithms, PPO better balances exploration and exploitation and shows stronger learning stability and global optimization capability in dynamic multi-obstacle scenarios. Simulation results show that PPO-based UAV path planning outperforms the comparison algorithms in tracking accuracy, convergence speed, and robustness. In specific scenarios, Q-learning reaches a trajectory error of approximately 1 m, TD3 and APF exhibit errors of around 0.3 m with noticeable oscillations, and PPO achieves an error of about 0.2 m. With PPO, the UAV follows the vehicle trajectory smoothly, with a more continuous path and a rapidly converging, stable error curve, indicating the promise of PPO for intelligent UAV control.
The PPO-based UAV-tracking path planning method effectively enhances the UAV’s intelligent decision-making and path optimization capabilities, providing new technical approaches and a research foundation for intelligent UAV traffic and cooperative control systems.
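As an illustration of how the three objective terms named above (tracking error, energy, and safety distance) could be folded into a single per-step reward for a PPO agent, the following is a minimal sketch only. The function name `tracking_reward`, the weights `w_track`, `w_energy`, `w_safe`, the safety radius `d_safe`, and the use of the squared control magnitude as an energy proxy are all assumptions made for this example; they are not taken from the study.

```python
import numpy as np


def tracking_reward(uav_pos, target_pos, action, obstacles,
                    w_track=1.0, w_energy=0.05, w_safe=5.0, d_safe=1.0):
    """Hypothetical shaped reward; all weights and d_safe are illustrative."""
    uav_pos = np.asarray(uav_pos, dtype=float)
    target_pos = np.asarray(target_pos, dtype=float)
    action = np.asarray(action, dtype=float)

    # Tracking term: Euclidean distance between UAV and ground target.
    track_err = float(np.linalg.norm(uav_pos - target_pos))

    # Energy proxy (assumption): squared magnitude of the control command.
    energy = float(np.sum(action ** 2))

    # Safety term: how far the UAV has intruded into each obstacle's
    # safety radius; zero when it is at least d_safe away.
    violation = 0.0
    for obs in obstacles:
        dist = float(np.linalg.norm(uav_pos - np.asarray(obs, dtype=float)))
        violation += max(0.0, d_safe - dist)

    # Reward is the negative weighted cost, so smaller cost is better.
    return -(w_track * track_err + w_energy * energy + w_safe * violation)
```

Under these assumptions the reward is zero only when the UAV sits exactly on the target, applies no control effort, and stays outside every safety radius; any deviation makes it negative, which gives the policy-gradient update a dense signal at every step.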
Qiao et al. (Thu,) studied this question.