April 18, 2024 · Open Access

Interacting two-hand instance segmentation and pose estimation based on attention-induced separation

Abstract

Estimating 3D hand poses from a single RGB image is essential in real-world applications. However, the task is challenging because one or two hands may appear unpredictably, the hands look alike and occlude each other, and the image lacks depth cues. To address these challenges, we propose a framework for two-hand instance segmentation and pose estimation based on attention-induced separation. The framework first extracts hand joint heatmaps from the image; these serve as spatial attention and are fused with the input image along the channel dimension to perform hand instance segmentation. The hand joint heatmaps and hand masks are then combined to provide denser spatial attention and are again fused with the input image along the channel dimension for hand separation. Finally, the resulting five-channel image is used for single-hand pose estimation. We extend the canonical 21-joint hand model to a 128-joint one to provide more effective hand-joint heatmap attention. Moreover, we use prior knowledge implied by the hand skeleton to help generate biomechanically feasible hand poses. Experimental results indicate that our framework outperforms state-of-the-art methods in generalization ability for single- and two-hand pose estimation.
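
The channel-wise fusion described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the joint heatmaps are reduced to an attention channel, so the per-pixel maximum used here is an assumption, as is the 3 + 1 + 1 channel split (RGB, heatmap attention, hand mask) chosen to match the five-channel image. The names `collapse_heatmaps` and `fuse_channels` are hypothetical.

```python
from typing import List

Channel = List[List[float]]  # one H x W plane of a channel-first image


def collapse_heatmaps(heatmaps: List[Channel]) -> Channel:
    """Merge per-joint heatmaps into one attention plane.

    Assumption: per-pixel maximum; the abstract does not specify the reduction.
    """
    height, width = len(heatmaps[0]), len(heatmaps[0][0])
    return [
        [max(h[y][x] for h in heatmaps) for x in range(width)]
        for y in range(height)
    ]


def fuse_channels(image: List[Channel], *extra: Channel) -> List[Channel]:
    """Concatenate extra planes onto a channel-first image along the channel axis."""
    height, width = len(image[0]), len(image[0][0])
    for plane in extra:
        if len(plane) != height or any(len(row) != width for row in plane):
            raise ValueError("attention plane must match the image size")
    return list(image) + list(extra)


# Toy 2x2 RGB image (three identical planes, purely illustrative values).
rgb = [[[0.1, 0.2], [0.3, 0.4]] for _ in range(3)]

# Two toy joint heatmaps, each peaking at a different pixel.
joint_heatmaps = [
    [[0.9, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 0.7]],
]

# Toy binary hand mask, standing in for the segmentation output.
hand_mask = [[1.0, 0.0], [0.0, 1.0]]

attention = collapse_heatmaps(joint_heatmaps)

# Stage 1: image plus heatmap attention, the input to instance segmentation.
stage1_input = fuse_channels(rgb, attention)

# Stage 2: image plus heatmap attention plus mask, the five-channel image.
stage2_input = fuse_channels(rgb, attention, hand_mask)
```

Keeping the RGB planes unchanged and appending the attention planes means a downstream network only needs its first layer widened to accept the extra input channels.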

Authors

Digang Sun

Ping Zhang

University of Electronic Science and Technology of China

Institutions

South China University of Technology

Cite this study

Sun et al. (2024) studied this question.

synapsesocial.com/papers/68e6e9adb6db643587664fe4 — DOI: https://doi.org/10.36227/techrxiv.171341028.82821498/v1
