Multi-radar fusion is fundamental to robust, all-weather perception across diverse applications. However, current fusion paradigms face structural and computational bottlenecks. Traditional statistical frameworks suffer from a dimensionality explosion: their computational complexity grows with the number of active sensor nodes. Moreover, existing statistical and deep learning fusion models are brittle; because they are rigidly bound to a predefined number of sensors, their performance drops when sensors drop out. Furthermore, generic attention mechanisms are mismatched to the phenomenology of radar signals: they neglect the spatial features of radar targets, which leads to false alarms. To overcome these limitations, we propose RadarsBEV, a scalable end-to-end multi-radar detection framework. By decoupling per-sensor feature extraction from the central spatial fusion process, RadarsBEV achieves permutation invariance with respect to the sensor set. This design removes the scalability limit and enables graceful degradation: when sensors fail, the system continues to operate on the remaining nodes without downtime. Crucially, we introduce a physics-aware Gaussian cross-attention mechanism. By guiding sparse feature sampling with the predicted two-dimensional Gaussian geometry of each target, this mechanism decouples the attention weights from clutter signals. Extensive experiments on high-fidelity simulations and real-world datasets demonstrate that RadarsBEV achieves better detection performance. Notably, the framework exhibits robust zero-shot generalization across sensor configurations, adapting to entirely unseen spatial layouts and degraded operational environments without fine-tuning.
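The two design ideas above can be illustrated with a minimal sketch. It is not the RadarsBEV implementation: the function names (`fuse_bev`, `gaussian_weights`, `gaussian_attend`), the use of an element-wise mean as the fusion operator, the shared 2D grid, and the axis-aligned Gaussian are all assumptions made for illustration. The sketch shows only that (i) a symmetric pooling over per-sensor feature maps is independent of sensor order and sensor count, and (ii) sampling weights derived from a Gaussian target geometry depend on position alone, not on the feature values being sampled.

```python
import math
from typing import List, Sequence, Tuple

Grid = List[List[float]]

def fuse_bev(per_sensor_maps: Sequence[Grid]) -> Grid:
    """Element-wise mean over any number of per-sensor grids.

    Assumption: a mean stands in for the fusion operator. Any symmetric
    reduction gives the same order independence and accepts a variable
    number of sensors, so a dropped sensor simply shortens the input list.
    """
    if not per_sensor_maps:
        raise ValueError("at least one sensor map is required")
    n = len(per_sensor_maps)
    rows, cols = len(per_sensor_maps[0]), len(per_sensor_maps[0][0])
    return [
        [sum(m[r][c] for m in per_sensor_maps) / n for c in range(cols)]
        for r in range(rows)
    ]

def gaussian_weights(
    points: Sequence[Tuple[int, int]],
    mean: Tuple[float, float],
    sigma: Tuple[float, float],
) -> List[float]:
    """Normalized weights of an axis-aligned 2D Gaussian at (x, y) points.

    The weights are a function of geometry only; feature amplitudes
    (including clutter) never enter the computation.
    """
    mx, my = mean
    sx, sy = sigma
    raw = [
        math.exp(-0.5 * (((x - mx) / sx) ** 2 + ((y - my) / sy) ** 2))
        for x, y in points
    ]
    total = sum(raw)
    if total == 0.0:  # all points far outside the Gaussian: fall back to uniform
        return [1.0 / len(points)] * len(points)
    return [w / total for w in raw]

def gaussian_attend(
    fused: Grid,
    points: Sequence[Tuple[int, int]],
    mean: Tuple[float, float],
    sigma: Tuple[float, float],
) -> float:
    """Weighted sum of fused features sampled at integer grid cells (x=col, y=row)."""
    weights = gaussian_weights(points, mean, sigma)
    return sum(w * fused[y][x] for w, (x, y) in zip(weights, points))
```

In this toy setting, calling `fuse_bev` with two maps instead of three requires no change to the code, which mirrors the graceful-degradation property described above; a learned model would replace both the mean and the fixed Gaussian parameters with predicted quantities.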
Guo et al. (Thu,) studied this question.