April 23, 2024 · Open Access

VoxelFSD: voxel-based fully sparse detector with sparse convolution for 3D object detection

Abstract

In recent years, convolutional neural networks (CNNs) in computer graphics have been transferred and applied to 3D object detection, achieving promising performance. However, challenges remain in this area. To address the sharp increase in runtime that current voxel-based detectors incur in large-scale point cloud perception, this paper proposes a fully sparse detector, VoxelFSD, which is capable of real-time long-range perception. The model consists of three key components: (1) Parallel Convolutional Branches (PCB), which not only expand the model's receptive field but also mitigate the impact that the loss of object center features has on the results; (2) a Sparse RPN head, which predicts candidate boxes in a sparse rather than dense manner, enabling the model to handle long-range perception tasks effectively; and (3) an ROI head with an attention fusion module (AFM-ROI), which uses cross-attention in the second stage to fuse the extracted 3D backbone features with the compressed BEV features, further improving model performance. Based on these modules, we propose a single-stage lightweight detector, VoxelFSD-S, and a two-stage detector, VoxelFSD-T. VoxelFSD-S outperforms previous voxel-based lightweight detectors, while VoxelFSD-T achieves a mAP of 81.50% on the KITTI test set. The code and results are available at https://github.com/seu-zwd/VoxelFSD
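To make the dense-versus-sparse distinction concrete, the following is a minimal sketch and not the authors' implementation. It assumes that "sparse" prediction means the head is evaluated only at occupied voxel locations, whereas a dense head visits every cell of the bird's-eye-view grid. The function names, the toy points, and the 0.5 m voxel size are all hypothetical.

```python
from collections import defaultdict


def voxelize(points, voxel_size):
    """Group (x, y) points by the BEV cell they fall in.

    Only occupied cells are stored, which is the assumption behind
    a sparse representation in this sketch.
    """
    cells = defaultdict(list)
    for x, y in points:
        key = (int(x // voxel_size), int(y // voxel_size))
        cells[key].append((x, y))
    return dict(cells)


def dense_cell_count(range_m, voxel_size):
    """Cells a dense head would visit on a square [0, range_m)^2 grid."""
    side = int(range_m // voxel_size)
    return side * side


# Hypothetical toy scene: the same four points at both perception ranges.
points = [(1.2, 3.4), (1.3, 3.3), (10.0, 20.0), (45.5, 12.1)]
occupied = voxelize(points, voxel_size=0.5)

sparse_sites = len(occupied)               # depends on the points only
dense_near = dense_cell_count(50.0, 0.5)   # 50 m range
dense_far = dense_cell_count(200.0, 0.5)   # 200 m range

print(sparse_sites, dense_near, dense_far)
```

In this toy setting the number of dense cells grows quadratically with the perception range, while the number of occupied sites depends only on the points themselves, which is consistent with the abstract's motivation for predicting sparsely in long-range scenes.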
