Estimating 3D hand poses from single RGB images is essential in real-world applications. However, the task is challenging because one or two hands may appear unpredictably, the hands resemble and occlude each other, and a single RGB image lacks depth cues. To address these challenges, we propose a framework for two-hand instance segmentation and pose estimation based on attention-induced separation. The framework first extracts hand joint heatmaps from the image; these serve as spatial attention and are fused with the input image along the channel dimension to perform hand instance segmentation. The hand joint heatmaps and hand masks are then combined to provide denser spatial attention and are fused with the input image along the channel dimension again to separate the hands. Finally, the resulting five-channel image is used for single-hand pose estimation. We extend the canonical 21-joint hand model to a 128-joint one to provide more effective hand-joint heatmap attention. Moreover, we use prior knowledge implied by the hand skeleton to help generate biomechanically feasible hand poses. Experimental results indicate that our framework outperforms state-of-the-art methods in generalization ability for single- and two-hand pose estimation.
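The channel-wise fusion described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the Gaussian heatmap rendering, the max-over-joints reduction to a single attention channel, and the channel order (RGB, heatmap, mask) are all assumptions made here for illustration. The abstract only states that the fusion happens along the channel dimension and that the final input has five channels.

```python
import numpy as np


def gaussian_heatmap(height, width, center_xy, sigma=2.0):
    """Render one joint as a 2D Gaussian that peaks at center_xy = (x, y)."""
    ys, xs = np.mgrid[0:height, 0:width]
    cx, cy = center_xy
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))


def collapse_heatmaps(joint_heatmaps):
    """Reduce (J, H, W) per-joint heatmaps to one (1, H, W) attention channel.

    Assumption: a max over joints is used; the abstract does not say how
    the per-joint heatmaps are reduced.
    """
    return joint_heatmaps.max(axis=0, keepdims=True)


def fuse_for_segmentation(image, joint_heatmaps):
    """Stack RGB (3, H, W) with the heatmap attention into a (4, H, W) input."""
    return np.concatenate([image, collapse_heatmaps(joint_heatmaps)], axis=0)


def fuse_for_pose(image, joint_heatmaps, hand_mask):
    """Stack RGB, heatmap attention, and one hand's mask into (5, H, W)."""
    attention = collapse_heatmaps(joint_heatmaps)
    mask = hand_mask[np.newaxis].astype(attention.dtype)
    return np.concatenate([image, attention, mask], axis=0)


# Hypothetical toy input: a blank 64x64 RGB image, two joints, one square mask.
image = np.zeros((3, 64, 64))
heatmaps = np.stack([
    gaussian_heatmap(64, 64, (10, 20)),
    gaussian_heatmap(64, 64, (40, 30)),
])
mask = np.zeros((64, 64), dtype=bool)
mask[10:40, 5:50] = True

seg_input = fuse_for_segmentation(image, heatmaps)   # four channels
pose_input = fuse_for_pose(image, heatmaps, mask)    # five channels
```

In this sketch the per-joint heatmaps are collapsed to a single channel so that the channel count matches the five-channel image mentioned in the abstract; with the 128-joint model, only the number of stacked heatmaps before the reduction would change.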
Digang Sun
Ping Zhang
University of Electronic Science and Technology of China
South China University of Technology
Sun et al. (Thu,) studied this question.
synapsesocial.com/papers/68e6e9adb6db643587664fe4 — DOI: https://doi.org/10.36227/techrxiv.171341028.82821498/v1