May 23, 2024 · Open Access

Research on multi-source data fusion target detection algorithm based on adaptive multi-scale and dynamic feature extraction

Abstract

To address LiDAR's low accuracy in distinguishing similar objects and detecting distant small targets, we design a complementary camera-LiDAR 3D object detection network, the Multi-scale Dynamic Feature Voxel to Point network (MDVP-RCNN). MDVP-RCNN is a two-stage 3D object detection network that uses points of the point cloud as nodes and integrates point cloud features and image information onto those points. In the first stage, the raw point cloud is downsampled to a fixed number of key points via Farthest Point Sampling (FPS), and sparse convolutions and deformable convolutions serve as the backbone network for voxel feature extraction. A dual-channel attention mechanism is introduced in the bird's-eye view (BEV); it sequentially learns the essential characteristics of the pseudo-2D image and compensates for features lost when the point cloud is projected to 2D. In the second stage, a feature aggregation module combines the color information of the image with the point cloud information in a weighted manner. Experimental results show that the network performs well on small targets, achieving Average Precision (AP) of 61.76%, 67.66%, and 82.36% for pedestrian, cyclist, and car, respectively. Code is available at https://github.com/3623687277/MDVP-RCNN
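
The key-point downsampling step mentioned in the abstract can be illustrated with a minimal sketch of generic Farthest Point Sampling. This is not the authors' implementation (see their repository for that); the function names `squared_distance` and `farthest_point_sampling` and the `start_index` parameter are assumptions made for illustration, and a real pipeline would run a vectorized or GPU version.

```python
from typing import List, Sequence

Point = Sequence[float]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_sampling(
    points: Sequence[Point], num_samples: int, start_index: int = 0
) -> List[int]:
    """Return indices of `num_samples` points chosen by greedy FPS.

    Illustrative sketch only: start_index is a hypothetical parameter
    that fixes the first key point so the result is deterministic.
    """
    n = len(points)
    if not 0 < num_samples <= n:
        raise ValueError("num_samples must be between 1 and len(points)")
    if not 0 <= start_index < n:
        raise ValueError("start_index is out of range")

    selected = [start_index]
    # Squared distance from each point to its nearest selected key point.
    min_sq_dist = [float("inf")] * n

    while len(selected) < num_samples:
        last = points[selected[-1]]
        # Only the newest key point can shrink the nearest-key-point distance.
        for i, p in enumerate(points):
            d = squared_distance(p, last)
            if d < min_sq_dist[i]:
                min_sq_dist[i] = d
        # The next key point is the one farthest from all selected key points.
        selected.append(max(range(n), key=min_sq_dist.__getitem__))

    return selected
```

Each iteration adds the point whose distance to the already selected set is largest, which is why FPS tends to keep sparse, distant regions of the cloud represented in the fixed-size key-point set.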
