6D pose estimation determines the 3D position and 3D orientation of objects from images. End-to-end pose estimation methods are designed to regress accurate object poses directly. To further improve performance, we propose a learning-based multiple feature guidance network (MFG-Net) for 6D pose regression. The network simultaneously regresses dense 3D coordinate maps, visible segmentation maps, surface region maps, and 2D directional vector maps. Guided by these dense features, we construct dense 2D-3D correspondences from which the 6D pose parameters of objects are regressed more precisely. To make the network robust to small distortions or noise in the image, we construct a dual-channel regression framework guided by Gaussian blur that enforces pose consistency and improves generalization. Skip structures are introduced in the encoder-decoder model to retain the detailed information contained in low-level feature maps, thereby improving the accuracy of the dense feature map predictions. Through these improvements in multi-feature guidance, network structure, and data augmentation, we enhance the pose estimation capability of the trained network, as evidenced by significant improvements in test results on the LINEMOD dataset.
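The Gaussian-blur-guided pose consistency idea can be illustrated with a minimal sketch. The abstract does not give the loss formulation, so everything below is an assumption made for illustration: the function names, the separable blur with clamped borders, and the consistency measure (geodesic rotation angle plus Euclidean translation distance) are hypothetical and are not taken from MFG-Net. In a two-channel setup of this kind, one channel would see the original image, the other its blurred copy, and the two predicted poses would be penalized for disagreeing.

```python
import math

def gaussian_kernel_1d(sigma, radius):
    """Normalized 1D Gaussian kernel of length 2 * radius + 1."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_rows(image, kernel):
    """Convolve every row with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    width = len(image[0])
    return [
        [sum(k * row[min(max(x + j - radius, 0), width - 1)]
             for j, k in enumerate(kernel))
         for x in range(width)]
        for row in image
    ]

def transpose(image):
    return [list(column) for column in zip(*image)]

def gaussian_blur(image, sigma=1.0, radius=2):
    """Separable Gaussian blur of a 2D grayscale image (list of lists)."""
    kernel = gaussian_kernel_1d(sigma, radius)
    horizontal = blur_rows(image, kernel)
    return transpose(blur_rows(transpose(horizontal), kernel))

def rotation_angle(R_a, R_b):
    """Geodesic angle in radians between two 3x3 rotation matrices."""
    # trace(R_a^T R_b) equals the sum of elementwise products.
    trace = sum(R_a[i][j] * R_b[i][j] for i in range(3) for j in range(3))
    cos_theta = min(1.0, max(-1.0, (trace - 1.0) / 2.0))
    return math.acos(cos_theta)

def pose_consistency(pose_a, pose_b):
    """Assumed consistency term: rotation angle plus translation distance."""
    (R_a, t_a), (R_b, t_b) = pose_a, pose_b
    return rotation_angle(R_a, R_b) + math.dist(t_a, t_b)
```

In use, `gaussian_blur(image)` would produce the input for the second channel, and `pose_consistency(pose_original, pose_blurred)` would be added to the training objective. A real implementation would express both steps in a differentiable tensor framework and would likely weight the rotation and translation terms separately.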
Liu et al. (Fri,) studied this question.