The development of intelligent volleyball training requires artificial intelligence (AI) technology that can accurately capture athletes' complex poses. However, current human pose estimation methods based on Convolutional Neural Networks (CNNs) frequently lose fine-grained features when processing volleyball scenes involving severe occlusion, rapid motion, and multi-person interactions, and they make insufficient use of semantic context information. These limitations restrict their practical application in real training scenarios. To address them, this study proposes a model based on the Contextual Hybrid Convolutional High-Resolution Network (CHC-HRNet). The model captures both the local details of key points and their global semantic associations by integrating multi-scale parallel feature processing mechanisms, content-aware up-sampling modules, and contextual dynamic awareness modules. Experimental results show that CHC-HRNet achieves 76.4% Average Precision (AP) at 256×192 resolution, a 1.73% improvement over baseline models. At 384×288 resolution, the model reaches 77.3% AP and 82.7% Average Recall (AR), outperforming comparison models by 1.07% and 1.85%, respectively. The ablation experiment further confirms the critical role of the contextual dynamic awareness module in performance enhancement, highlighting its advantages in fine-grained information capture and global semantic understanding. This study advances the application of deep learning to sports pose analysis and provides a technical basis for building real-time, accurate intelligent training assistance systems, demonstrating the potential of AI to improve training efficiency in sports science.
Liu et al. studied this question.