This paper addresses raw data fusion in neural-network-based 3D object detection architectures, considering the case of autonomous driving with data from camera and LiDAR sensors. Understanding the vehicle's surroundings is a crucial task in autonomous driving, since every subsequent action depends strongly on it. We present an alternative method of fusing camera image information with LiDAR point clouds at a close-to-raw level of abstraction. Our results suggest that the approach improves the average precision of 3D bounding box detection of cyclists (and possibly other objects) in sparse point clouds, compared to the baseline architecture without low-level fusion. The approach was evaluated on the KITTI dataset, which contains driving scenes with corresponding camera and LiDAR data. The long-term goal of our research is to develop a neural network architecture for environment perception that fuses multi-sensor data at the earliest possible stage, thus leveraging the full benefit of inter-sensor synergies.
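To make the idea of fusion at a close-to-raw level more concrete, the following is a minimal sketch of one common low-level fusion step: projecting each LiDAR point into the camera image and appending the colour of the pixel it lands on. It is not the method proposed in the paper, whose details the abstract does not give. The function names, the tuple layout, and the assumption that the points are already expressed in the camera coordinate frame and that `P` is a 3x4 projection matrix are all illustrative choices.

```python
import math
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float, float, float]   # x, y, z, intensity (hypothetical layout)
Pixel = Tuple[int, int, int]                # r, g, b
FusedPoint = Tuple[float, float, float, float, int, int, int]
Matrix = Sequence[Sequence[float]]          # 3x4 projection matrix (assumed given)


def project_point(P: Matrix, x: float, y: float, z: float) -> Optional[Tuple[float, float]]:
    """Project a 3D point in the camera frame to pixel coordinates (u, v).

    Returns None when the point does not lie in front of the camera.
    """
    u_h = P[0][0] * x + P[0][1] * y + P[0][2] * z + P[0][3]
    v_h = P[1][0] * x + P[1][1] * y + P[1][2] * z + P[1][3]
    w = P[2][0] * x + P[2][1] * y + P[2][2] * z + P[2][3]
    if w <= 0.0:
        return None
    return u_h / w, v_h / w


def fuse_points_with_image(
    points: Sequence[Point],
    image: Sequence[Sequence[Pixel]],
    P: Matrix,
) -> List[FusedPoint]:
    """Append the RGB value of the hit pixel to every point that projects into the image.

    Points behind the camera or outside the image bounds are dropped.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    fused: List[FusedPoint] = []
    for x, y, z, intensity in points:
        uv = project_point(P, x, y, z)
        if uv is None:
            continue
        col, row = math.floor(uv[0]), math.floor(uv[1])
        if 0 <= row < height and 0 <= col < width:
            r, g, b = image[row][col]
            fused.append((x, y, z, intensity, r, g, b))
    return fused
```

The resulting seven-value points could then be fed to a point-cloud detector in place of the original four-value points. With KITTI data, the LiDAR points would first have to be transformed into the rectified camera frame using the calibration supplied with each scene; that step is omitted here.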
Rövid et al. studied this question.