Vehicle–infrastructure cooperative perception (VICP) extends the sensing capability of single-vehicle systems by integrating multi-source information from onboard and roadside sensors, thereby alleviating limitations in sensing range and field-of-view coverage. However, in complex urban environments, the robustness of such systems—particularly in terms of blind-spot coverage and feature representation—is severely affected by both static and dynamic occlusions, as well as distance-induced sparsity in point cloud data. To address these challenges, a 3D object detection framework incorporating point cloud feature enhancement and spatially adaptive fusion is proposed. First, to mitigate feature degradation under sparse and occluded conditions, a Redefined Squeeze-and-Excitation Network (R-SENet) attention module is integrated into the feature encoding stage. This module employs a dual-dimensional squeeze-and-excitation mechanism operating across pillars and intra-pillar points, enabling adaptive recalibration of critical geometric features. In addition, a Feature Pyramid Backbone Network (FPB-Net) is designed to improve target representation across varying distances through multi-scale feature extraction and cross-layer aggregation. Second, to address feature heterogeneity and spatial misalignment between heterogeneous sensing agents, a Spatial Adaptive Feature Fusion (SAFF) module is introduced. By explicitly encoding the origin of features and leveraging spatial attention mechanisms, the SAFF module enables dynamic weighting and complementary fusion between fine-grained vehicle-side features and globally informative roadside semantics. Extensive experiments conducted on the DAIR-V2X benchmark and a custom dataset demonstrate that the proposed approach outperforms several state-of-the-art methods. 
Specifically, on DAIR-V2X and the custom dataset, respectively, the framework achieves Average Precision (AP) of 0.762 and 0.694 at an IoU threshold of 0.5, and 0.617 and 0.563 at an IoU threshold of 0.7. The framework also maintains real-time inference, which supports its practical potential for real-world deployment.
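The dual-dimensional squeeze-and-excitation idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' R-SENet: the tensor layout (pillars, points per pillar, channels), the mean-pooling squeeze, the bottleneck ratio `r`, and the names `dual_se`, `excite`, and `params` are all assumptions made for illustration. It only shows how one gate computed across pillars and one gate computed across the points inside each pillar could jointly rescale pillar features.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def excite(desc, w1, w2):
    """Bottleneck gate in the style of a standard SE block: ReLU, then sigmoid."""
    hidden = np.maximum(desc @ w1, 0.0)
    return sigmoid(hidden @ w2)

def dual_se(pillars, params):
    """Rescale pillar features with a point-wise gate and a pillar-wise gate.

    pillars: array of shape (P, N, C) = (pillars, points per pillar, channels).
    params:  hypothetical weights; pw1/pw2 act over the N points,
             qw1/qw2 act over the P pillars.
    """
    # Intra-pillar branch: squeeze channels, then gate the N points of each pillar.
    point_desc = pillars.mean(axis=2)                                # (P, N)
    point_gate = excite(point_desc, params["pw1"], params["pw2"])    # (P, N)

    # Pillar branch: squeeze points and channels, then gate the P pillars.
    pillar_desc = pillars.mean(axis=(1, 2))[None, :]                 # (1, P)
    pillar_gate = excite(pillar_desc, params["qw1"], params["qw2"])[0]  # (P,)

    # Broadcast both gates back onto the original features.
    return pillars * point_gate[:, :, None] * pillar_gate[:, None, None]

# Toy usage with random weights (assumed sizes, bottleneck ratio r = 2).
rng = np.random.default_rng(0)
P, N, C, r = 8, 4, 6, 2
pillars = rng.normal(size=(P, N, C))
params = {
    "pw1": rng.normal(size=(N, N // r)), "pw2": rng.normal(size=(N // r, N)),
    "qw1": rng.normal(size=(P, P // r)), "qw2": rng.normal(size=(P // r, P)),
}
out = dual_se(pillars, params)
print(out.shape)  # (8, 4, 6)
```

Because both gates lie strictly between 0 and 1, the sketch can only attenuate features; a trained module would learn weights so that informative pillars and points are attenuated least.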
Shiyang Yan
Queen's University Belfast
Yanfeng Wu
Zhennan Liu
Guizhou Institute of Technology
World Electric Vehicle Journal
Henan University of Science and Technology
Yutong (China)
Yan et al. studied this question.
synapsesocial.com/papers/69c4cd30fdc3bde44891929e — DOI: https://doi.org/10.3390/wevj17040164