What question did this study set out to answer?

The central aim is to develop a robust BEV perception framework that integrates 4D radar and camera technologies to improve performance under adverse weather conditions.

March 21, 2026 · Open Access

Robust BEV Perception via Dual 4D Radar–Camera Fusion Under Adverse Conditions with Fog-Aware Enhancement

Key Points

Integration of dual-source 4D millimeter-wave radar and multi-view camera images
Doppler-Aware Radar Encoder enhances motion-sensitive features
Fog-Aware Feature Denoising Module improves low-visibility consistency
Multi-Modal Temporal Fusion Module uses Transformer for motion modeling
Confidence-aware multi-task loss jointly supervises semantic segmentation, motion estimation, and object detection
Significant improvements in BEV segmentation accuracy compared to state-of-the-art methods
Enhanced detection robustness in adverse weather conditions
Improved motion stability during dynamic object interactions
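
One way to picture the "motion-sensitive features" point for the Doppler-Aware Radar Encoder is attention pooling in which radar points with larger Doppler speed receive larger weights. The sketch below is only an illustration under that assumption: the function name `velocity_guided_pool`, the `temperature` parameter, and the use of a plain softmax over absolute radial velocity are hypothetical choices, not the encoder described by the authors.

```python
import math


def velocity_guided_pool(features, radial_velocities, temperature=1.0):
    """Pool per-point feature vectors using softmax weights from |Doppler velocity|.

    Hypothetical illustration: faster-moving points get larger attention weights.
    features: list of equal-length lists of floats, one per radar point.
    radial_velocities: list of floats (m/s), one per radar point.
    """
    if not features or len(features) != len(radial_velocities):
        raise ValueError("need one radial velocity per non-empty feature list")
    if temperature <= 0:
        raise ValueError("temperature must be positive")

    # Attention scores: absolute radial speed, scaled by the temperature.
    scores = [abs(v) / temperature for v in radial_velocities]

    # Numerically stable softmax over the points.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    norm = sum(exps)
    weights = [e / norm for e in exps]

    # Weighted sum of the feature vectors, one output value per channel.
    dim = len(features[0])
    pooled = [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]
    return pooled, weights
```

With equal speeds the pooling reduces to a plain average; as one point's speed grows, the pooled vector moves toward that point's features. Lowering `temperature` sharpens this preference.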

Abstract

Bird’s-eye-view (BEV) perception has emerged as a key representation for unified scene understanding in autonomous driving. However, current BEV methods relying solely on monocular cameras suffer from severe degradation under adverse weather and dynamic scenes due to limited depth cues and illumination dependency. To address these challenges, we propose a robust multi-modal BEV perception framework that integrates dual-source 4D millimeter-wave radar and multi-view camera images. The proposed architecture systematically exploits Doppler velocity and temporal information from 4D radar to model dynamic object motion, while introducing a deformable fusion strategy in the BEV space for accurate semantic alignment across modalities. Our design includes four key modules: a Doppler-Aware Radar Encoder (DARE) that enhances motion-sensitive features via velocity-guided attention; a Fog-Aware Feature Denoising Module (FADM) that suppresses modality inconsistency in low-visibility conditions through cross-modal attention and residual enhancement; a Multi-Modal Temporal Fusion Module (TFM) that encodes radar temporal sequences using a Transformer encoder for motion continuity modeling; and a confidence-aware multi-task loss that jointly supervises semantic segmentation, motion estimation, and object detection. Extensive experiments on the DualRadar dataset and adverse-weather simulations demonstrate that our method achieves significant gains over state-of-the-art baselines in BEV segmentation accuracy, detection robustness, and motion stability. The proposed framework offers a scalable and resilient solution for real-world autonomous perception, especially under challenging environmental conditions.
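
The abstract does not give the form of the confidence-aware multi-task loss. As a rough illustration of how three task losses might be balanced by learned confidences, the sketch below uses the commonly cited log-variance (uncertainty) weighting, total = sum_i exp(-s_i) * L_i + s_i. This weighting is a stand-in chosen for illustration, and the function name, task names, and numeric values are all hypothetical.

```python
import math


def confidence_weighted_loss(task_losses, log_vars):
    """Combine per-task losses with per-task log-variance weights.

    Stand-in formulation (an assumption, not the paper's loss):
        total = sum_i exp(-s_i) * L_i + s_i
    A larger s_i (lower confidence) down-weights task i, while the
    additive s_i term discourages inflating s_i without limit.
    """
    if set(task_losses) != set(log_vars):
        raise ValueError("task_losses and log_vars must name the same tasks")
    total = 0.0
    for name, loss in task_losses.items():
        s = log_vars[name]
        total += math.exp(-s) * loss + s
    return total


# Hypothetical per-task loss values for one training step.
losses = {"segmentation": 0.8, "motion": 0.3, "detection": 1.2}

# With all log-variances at zero, the result is the plain sum of the losses.
equal_confidence = confidence_weighted_loss(
    losses, {"segmentation": 0.0, "motion": 0.0, "detection": 0.0}
)

# Raising the detection log-variance shrinks that task's weighted contribution.
low_detection_confidence = confidence_weighted_loss(
    losses, {"segmentation": 0.0, "motion": 0.0, "detection": 1.0}
)
```

In a real training loop the log-variances would be trainable parameters updated together with the network weights; here they are fixed numbers so the arithmetic is easy to follow.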


Cite This Study

Li et al. studied this question.

synapsesocial.com/papers/69be35d76e48c4981c674555 https://doi.org/10.3390/electronics15061284
