3D LiDAR (light detection and ranging) semantic segmentation is important for scene understanding in many applications, such as autonomous driving and robotics. For example, for autonomous cars equipped with RGB cameras and LiDAR, it is crucial to fuse complementary information from the different sensors for robust and accurate segmentation. Existing fusion-based methods, however, may not achieve promising performance because of the vast difference between the two modalities. In this work, we investigate a collaborative fusion scheme called perception-aware multi-sensor fusion (PMF) to exploit perceptual information from the two modalities, namely, appearance information from RGB images and spatio-depth information from point clouds. To this end, we first project point clouds to the camera coordinates to provide spatio-depth information for RGB images. Then, we propose a two-stream network that extracts features from the two modalities separately and fuses them with effective residual-based fusion modules. Moreover, we propose additional perception-aware losses to measure the perceptual difference between the two modalities. Extensive experiments on two benchmark data sets show the superiority of our method. For example, on nuScenes, our PMF outperforms the state-of-the-art method by 0.8% in mIoU.
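The projection step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a pinhole camera model and known calibration; the function name `project_points`, the 3x4 `extrinsic` matrix (LiDAR frame to camera frame), and the 3x3 `intrinsic` matrix are hypothetical names chosen for illustration and are not taken from the paper, whose exact projection procedure is not given here.

```python
import math
from typing import List, Optional, Tuple

Point = Tuple[float, float, float]
Matrix = List[List[float]]


def project_points(
    points: List[Point],
    extrinsic: Matrix,  # assumed 3x4 [R | t]: LiDAR frame -> camera frame
    intrinsic: Matrix,  # assumed 3x3 pinhole camera matrix K
    width: int,
    height: int,
) -> List[Optional[Tuple[int, int, float]]]:
    """Return (column, row, depth) per point, or None if it is not visible."""
    results: List[Optional[Tuple[int, int, float]]] = []
    for x, y, z in points:
        # Rigid transform of the LiDAR point into the camera frame.
        xc, yc, zc = (
            r[0] * x + r[1] * y + r[2] * z + r[3] for r in extrinsic
        )
        if zc <= 0.0:
            # The point lies behind the image plane.
            results.append(None)
            continue
        # Pinhole projection: apply K, then divide by depth.
        u = (intrinsic[0][0] * xc + intrinsic[0][1] * yc + intrinsic[0][2] * zc) / zc
        v = (intrinsic[1][0] * xc + intrinsic[1][1] * yc + intrinsic[1][2] * zc) / zc
        col, row = math.floor(u), math.floor(v)
        if 0 <= col < width and 0 <= row < height:
            results.append((col, row, zc))
        else:
            # The point projects outside the image bounds.
            results.append(None)
    return results
```

Each visible point yields a pixel location and a depth value, which is one way per-pixel spatio-depth information could be aligned with an RGB image; points behind the camera or outside the image are dropped.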
Zhuangwei Zhuang
South China University of Technology
Rong Li
Soochow University
Kui Jia
Guangxi Medical University
South China University of Technology
Zhuang et al. studied this question.
synapsesocial.com/papers/69dff60caf3798be7f860340 — DOI: https://doi.org/10.1109/iccv48922.2021.01597