This study proposes a multimodal deep reinforcement learning (MDRL) architecture, MDRL-Deep Q-Network (MDRL-DQN), based on an improved Q-Learning algorithm. It aims to optimize Unmanned Aerial Vehicle (UAV) scheduling and execution capabilities in intelligent unmanned combat planning. By integrating an attention mechanism and an adaptive reward mechanism, the algorithm fuses image data, sensor data, and intelligent information, enabling collaborative multimodal data processing. This improves task success rates, execution efficiency, and UAV deployment stability. Experimental results show that the improved MDRL-DQN algorithm has clear advantages in complex task scenarios. Specifically, in the long-distance dispersed defense scenario (Scenario 1) and the long-distance concentrated defense scenario (Scenario 3), the task success rates reach 89.6% and 94.8%, respectively, outperforming other algorithms by several percentage points. Additionally, in Scenario 1, MDRL-DQN completes tasks in 720.8 s, which is 16.7% less time than the 865.3 s required by Proximal Policy Optimization (PPO). These results indicate that the improved Q-Learning algorithm enhances the efficiency and stability of unmanned combat tasks, providing new insights for intelligent planning in future unmanned operations.
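The abstract does not specify how the attention mechanism combines the three modalities, so the following is only a minimal sketch of one common approach: scaled dot-product attention over per-modality embeddings, followed by a linear Q-value head and the standard one-step Q-learning target. All names (`attention_fuse`, `q_values`, `td_target`), the toy embeddings, the query vector, and the head weights are hypothetical and chosen for illustration; the adaptive reward mechanism is not modeled here.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(embeddings, query):
    """Fuse same-length modality embeddings with scaled dot-product attention.

    Returns the fused vector and the attention weight given to each modality.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * x for q, x in zip(query, emb)) / scale for emb in embeddings]
    weights = softmax(scores)
    fused = [
        sum(w * emb[i] for w, emb in zip(weights, embeddings))
        for i in range(len(query))
    ]
    return fused, weights


def q_values(fused, weight_rows):
    """Linear Q head (assumption): one weight row per discrete action."""
    return [sum(w * x for w, x in zip(row, fused)) for row in weight_rows]


def td_target(reward, next_q, gamma=0.99, done=False):
    """Standard one-step Q-learning target: r + gamma * max_a Q(s', a)."""
    return reward if done else reward + gamma * max(next_q)


# Hypothetical 4-dimensional embeddings, one per modality.
image_emb = [0.9, 0.1, 0.0, 0.3]
sensor_emb = [0.2, 0.8, 0.1, 0.0]
intel_emb = [0.0, 0.2, 0.7, 0.5]
query = [1.0, 0.0, 0.5, 0.0]  # would be learned in practice; fixed here

# Hypothetical head with 3 actions, randomly initialised with a fixed seed.
rng = random.Random(0)
head = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(3)]

fused, weights = attention_fuse([image_emb, sensor_emb, intel_emb], query)
q = q_values(fused, head)
target = td_target(reward=1.0, next_q=q)
print("attention weights:", weights)
print("Q-values:", q)
print("TD target:", target)
```

In a full agent, the embeddings would come from modality-specific encoders (for example, a convolutional network for images), and the query and head weights would be trained by minimizing the gap between the predicted Q-value and the TD target.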
Xu et al. studied this question.