As artificial intelligence is applied more deeply in sports, accurately recognizing ball sport actions in large-scale video data has become crucial for making training more scientific and competition analysis more intelligent. Current mainstream action classification and recognition methods face two challenges in dynamic, complex scenes. Traditional hand-crafted feature methods generalize poorly across complex backgrounds, lighting changes, and shooting angles. Single-stream convolutional neural networks can extract static appearance features, but they cannot fully capture the key temporal dynamics and long-term dependencies of an action. To address these issues, this study proposes a ball sport action classification and recognition system based on a two-stream convolutional neural network. The spatial branch extracts static poses using a convolutional neural network and a long short-term memory network, while the temporal branch captures detailed motion trajectories using 3D convolutional network modules. Finally, a multi-scale fusion strategy integrates the features of both branches hierarchically. Experimental results show that the system achieves Top-1 accuracies of 96.8% and 92.3% on the ball action subsets of UCF101 and HMDB51, respectively. Ablation experiments show that the system maintains a Top-1 accuracy of 72.5% under backlight conditions, demonstrating its adaptability to challenging scenes. These results indicate that the system significantly improves the ability to distinguish complex ball sport actions and offers a new technical paradigm for intelligent analysis in competitive sports.
Jihong Li (Sun,) studied this question.
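To make the two-stream idea and the Top-1 metric concrete, the following is a minimal sketch rather than the system described above. It assumes each branch has already produced per-class logits for a clip, and it replaces the multi-scale fusion strategy, whose details are not given here, with a simple weighted average of class probabilities. The names `fuse_scores`, `top1_accuracy`, and `spatial_weight` are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_scores(spatial_logits, temporal_logits, spatial_weight=0.5):
    """Score-level fusion of two streams (an assumed stand-in for the
    multi-scale fusion in the text): weighted average of probabilities."""
    if len(spatial_logits) != len(temporal_logits):
        raise ValueError("both streams must score the same set of classes")
    p_spatial = softmax(spatial_logits)
    p_temporal = softmax(temporal_logits)
    w = spatial_weight
    return [w * a + (1.0 - w) * b for a, b in zip(p_spatial, p_temporal)]


def top1_accuracy(fused_batch, labels):
    """Fraction of clips whose highest-probability class equals the label."""
    if len(fused_batch) != len(labels) or not labels:
        raise ValueError("need one label per clip and at least one clip")
    correct = 0
    for probs, label in zip(fused_batch, labels):
        predicted = max(range(len(probs)), key=probs.__getitem__)
        if predicted == label:
            correct += 1
    return correct / len(labels)


# Toy data: two clips, three hypothetical action classes.
batch = [
    fuse_scores([3.0, 0.0, 0.0], [2.0, 1.0, 0.0]),  # both streams favor class 0
    fuse_scores([0.0, 1.0, 0.0], [0.0, 0.0, 4.0]),  # temporal stream dominates
]
accuracy = top1_accuracy(batch, [0, 2])
```

In the second toy clip the spatial stream weakly prefers class 1 while the temporal stream strongly prefers class 2, so the fused prediction follows the motion evidence. This is the kind of case where a single appearance-only stream would be expected to fail.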