March 2, 2026 | Open Access

Robust 2D Human Pose Estimation with Parallel Graph–Attention Modeling and Entropy-Aware Feature Decoding

Key Points

The study aims to improve 2D human pose estimation under occlusion and background interference.
The authors propose PMNet, a Parallel Modeling Network that runs explicit graph-based structural modeling and implicit self-attention-based semantic modeling in parallel pathways.
A criss-cross attention module suppresses irrelevant features.
An adaptive nonlinear fusion strategy balances complementary information across the parallel branches.
An error-compensated decoding method sharpens heatmap distributions and refines keypoint localization.
PMNet attains 92.42% PCKh@0.5 on the MPII benchmark.
PMNet attains 77.3% AP on the COCO benchmark.
Ablation studies and qualitative visualizations show improved signal-to-noise ratios.
Heatmap responses are more concentrated, which supports more precise localization.
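
For readers unfamiliar with the MPII metric quoted above: PCKh@0.5 counts a predicted keypoint as correct when its distance to the ground truth is at most 0.5 times the person's head segment length. The following is a minimal sketch of that definition, not the official MPII evaluation script and not code from the paper. The function name `pckh`, its argument layout, and the toy coordinates are assumptions made for illustration, and the sketch ignores keypoint visibility flags.

```python
import math

def pckh(pred, gt, head_sizes, alpha=0.5):
    """Fraction of keypoints within alpha * head size of the ground truth.

    pred, gt: one list of (x, y) keypoints per person (hypothetical layout).
    head_sizes: head segment length per person, in pixels.
    """
    correct = 0
    total = 0
    for pred_person, gt_person, head in zip(pred, gt, head_sizes):
        for (px, py), (gx, gy) in zip(pred_person, gt_person):
            dist = math.hypot(px - gx, py - gy)
            # A keypoint is correct if its error is within the head-normalized threshold.
            if dist <= alpha * head:
                correct += 1
            total += 1
    if total == 0:
        raise ValueError("no keypoints to evaluate")
    return correct / total

# Toy data: one person, two keypoints, head size 10 px (threshold 5 px).
gt = [[(0.0, 0.0), (10.0, 10.0)]]
pred = [[(1.0, 0.0), (10.0, 20.0)]]  # errors of 1 px and 10 px
print(pckh(pred, gt, [10.0]))  # 0.5
```

A benchmark score such as 92.42% corresponds to this fraction computed over all annotated keypoints in the evaluation split.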

Abstract

Robust 2D human pose estimation remains challenging due to occlusion and background interference, which introduce substantial uncertainty into visual representations. This paper proposes PMNet, a Parallel Modeling Network that integrates explicit graph-based structural modeling and implicit self-attention-based semantic modeling through parallel pathways to jointly capture local dependencies and global contextual relationships among keypoints. From an information-theoretic perspective, occlusion and clutter can be interpreted as sources of increased representational entropy, and PMNet addresses this issue by progressively reducing uncertainty through complementary structural reasoning and attention-based information selection. The framework incorporates a criss-cross attention module to suppress irrelevant features, an adaptive nonlinear fusion strategy to balance complementary information across parallel branches, and an error-compensated decoding method to sharpen heatmap distributions and refine keypoint localization while maintaining efficiency. Extensive experiments on the MPII and COCO benchmarks demonstrate that PMNet achieves state-of-the-art or comparable performance, attaining 92.42% PCKh@0.5 on MPII and 77.3% AP on COCO. Ablation studies and qualitative visualizations further confirm the effectiveness of each component, showing improved signal-to-noise ratios and more concentrated heatmap responses. Overall, PMNet provides a robust and efficient pose estimation framework with strong potential for real-world applications such as surveillance and autonomous systems.
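
The information-theoretic framing in the abstract can be made concrete with a small sketch: if a keypoint heatmap is normalized into a probability distribution, its Shannon entropy measures how spread out the response is, so a more concentrated heatmap has lower entropy. The code below illustrates only that general idea. It is not PMNet's error-compensated decoding, whose details are not given here; the soft-argmax decoder is a common stand-in technique, and the helper names `softmax2d`, `entropy`, and `soft_argmax` are hypothetical.

```python
import math

def softmax2d(heatmap):
    """Normalize a 2D heatmap (list of rows) into a probability map."""
    flat = [v for row in heatmap for v in row]
    peak = max(flat)
    exps = [math.exp(v - peak) for v in flat]  # subtract the max for numerical stability
    total = sum(exps)
    width = len(heatmap[0])
    probs = [e / total for e in exps]
    return [probs[i:i + width] for i in range(0, len(probs), width)]

def entropy(prob_map):
    """Shannon entropy in nats; lower means a more concentrated response."""
    return -sum(p * math.log(p) for row in prob_map for p in row if p > 0.0)

def soft_argmax(prob_map):
    """Expected (x, y) location under the probability map (sub-pixel estimate)."""
    x = sum(p * col for row in prob_map for col, p in enumerate(row))
    y = sum(p * r for r, row in enumerate(prob_map) for p in row)
    return x, y

# Two toy 3x3 heatmaps with the same peak location but different sharpness.
diffuse = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
sharp = [[0.0, 0.0, 0.0], [0.0, 10.0, 0.0], [0.0, 0.0, 0.0]]

print(entropy(softmax2d(diffuse)) > entropy(softmax2d(sharp)))  # True
print(soft_argmax(softmax2d(sharp)))
```

Under this reading, occlusion or clutter that flattens a heatmap raises its entropy, and any step that sharpens the response lowers it.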
