May 29, 2026

A Transformer-Based Multimodal Adaptive Fusion System for UAV Obstacle Avoidance Integrating Photoelectric and Nano-Radar Sensors

Key Points

The aim is to develop a real-time obstacle avoidance system for UAVs using fused data from photoelectric and nano-radar sensors.
Used a Transformer architecture for multimodal adaptive fusion of sensor data.
Implemented a modified YOLOv5s for processing photoelectric images and an adapted PointNet for nano-radar point clouds.
Tested the system on a custom Unity 3D dataset to evaluate performance under various conditions.
Achieved a mean average precision (mAP) of 95.8% in ideal conditions.
Limited performance degradation to under 6% in extreme interference (glare, backlight, rain, fog), compared to over 30% for unimodal systems.
Demonstrated a 99.2% average obstacle avoidance success rate with an end-to-end response latency of 32.6 ms on an NVIDIA Jetson Xavier NX edge device.
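
To put the reported latency in perspective, the short calculation below converts 32.6 ms into an upper bound on the perception update rate and into the distance a UAV covers before the system can respond. This is only an illustrative sketch: the flight speed of 10 m/s is an assumed value that does not come from the study, and the update-rate figure assumes frames are processed one at a time with no pipelining.

```python
# Reported in the study: end-to-end response latency on the edge device.
LATENCY_MS = 32.6

# Hypothetical flight speed, chosen only for illustration (not from the study).
ASSUMED_SPEED_M_S = 10.0


def max_update_rate_hz(latency_ms: float) -> float:
    """Upper bound on updates per second if frames are processed sequentially."""
    return 1000.0 / latency_ms


def distance_during_latency_m(speed_m_s: float, latency_ms: float) -> float:
    """Distance travelled while one sensing-to-response cycle completes."""
    return speed_m_s * latency_ms / 1000.0


rate = max_update_rate_hz(LATENCY_MS)
blind = distance_during_latency_m(ASSUMED_SPEED_M_S, LATENCY_MS)
print(f"{rate:.1f} Hz, {blind:.3f} m")  # prints "30.7 Hz, 0.326 m"
```

Under these assumptions the system refreshes roughly 30 times per second, and a UAV at the assumed speed travels about a third of a metre per cycle.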

Abstract

The reliable operation of unmanned aerial vehicles (UAVs) in low-altitude economies requires robust obstacle avoidance, yet unimodal sensing fails under extreme lighting or weather. This paper presents a real-time obstacle avoidance system based on multimodal adaptive fusion of photoelectric and nano-radar sensors within a Transformer architecture. The system employs an end-to-end design with dual-stream heterogeneous feature extraction. A modified YOLOv5s processes photoelectric images for semantic features, while an adapted PointNet handles nano-radar point clouds for spatial geometry. A cross-modal multi-head self-attention mechanism dynamically fuses these features, overcoming the limitations of manually predefined modality weights. This design leverages the complementary nature of photoelectric sensors (high-resolution texture) and nano-radar (penetrating capability and precise depth), addressing nanoscale-level positioning challenges in dynamic environments. Experimental results on a custom Unity 3D dataset demonstrate that the system achieves a mean average precision (mAP) of 95.8% under ideal conditions. Notably, performance degradation under extreme interference (glare, backlight, rain, fog) is constrained to under 6%, compared to over 30% for unimodal systems. The end-to-end response latency is 32.6 ms on an NVIDIA Jetson Xavier NX edge device, with a 99.2% average obstacle avoidance success rate. By enabling deep feature interaction and dynamic adaptive weighting, the proposed system significantly enhances environmental robustness and real-time perception, providing a reliable hardware-software co-design solution for autonomous UAV navigation in complex low-altitude airspace.
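
The fusion idea described in the abstract can be illustrated with a minimal NumPy sketch of multi-head self-attention applied to the concatenated image and radar tokens, so that every token can attend to tokens from either modality. This is not the authors' implementation: the function names, token counts, feature width, head count, and random weights are all hypothetical, and random vectors stand in for the features that the YOLOv5s and PointNet branches would produce.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def init_params(d_model, rng):
    """Random projection matrices (hypothetical; a real model would learn these)."""
    scale = 1.0 / np.sqrt(d_model)
    return {name: rng.normal(0.0, scale, (d_model, d_model))
            for name in ("Wq", "Wk", "Wv", "Wo")}


def cross_modal_self_attention(img_tokens, radar_tokens, params, num_heads):
    """Multi-head self-attention over the joint image + radar token sequence.

    Returns the fused tokens (N, d_model) and the attention weights
    (num_heads, N, N), where N is the total number of tokens.
    """
    tokens = np.concatenate([img_tokens, radar_tokens], axis=0)
    n, d_model = tokens.shape
    assert d_model % num_heads == 0, "d_model must be divisible by num_heads"
    d_head = d_model // num_heads

    def split_heads(x):
        # (N, d_model) -> (num_heads, N, d_head)
        return x.reshape(n, num_heads, d_head).transpose(1, 0, 2)

    q = split_heads(tokens @ params["Wq"])
    k = split_heads(tokens @ params["Wk"])
    v = split_heads(tokens @ params["Wv"])

    # Scaled dot-product attention; weights are computed per input, not preset.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)
    weights = softmax(scores, axis=-1)
    heads = weights @ v

    # Merge heads back to (N, d_model) and apply the output projection.
    merged = heads.transpose(1, 0, 2).reshape(n, d_model)
    return merged @ params["Wo"], weights


rng = np.random.default_rng(0)
d_model, num_heads = 16, 4                   # assumed sizes for illustration
n_img, n_radar = 6, 4                        # assumed token counts
img_tokens = rng.normal(size=(n_img, d_model))      # stand-in for image features
radar_tokens = rng.normal(size=(n_radar, d_model))  # stand-in for radar features
params = init_params(d_model, rng)

fused, weights = cross_modal_self_attention(
    img_tokens, radar_tokens, params, num_heads)

# Fraction of each image token's attention that lands on radar tokens,
# averaged over heads: a data-dependent modality weighting.
radar_share = weights[:, :n_img, n_img:].sum(axis=-1).mean(axis=0)
```

Because the attention weights are recomputed from the features on every input, the share of attention given to radar tokens can shift when the image features change. That is the sense in which such a mechanism avoids manually predefined modality weights.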
