This study proposes a novel hybrid deep learning (DL) architecture to improve the accuracy of image feature extraction and action recognition in sports videos. The approach integrates the Visual Geometry Group (VGG) network with the Inception model: the VGG network captures fine-grained spatial features, while the Inception model’s multi-scale convolutions generate richer hierarchical feature representations. In experiments, the model achieves a recognition accuracy of 90.7%, a precision of 91%, and an F1-score of 0.906. In addition, temporal segmentation analysis shows that, for continuous video streams, a 48-frame time window yields the best recognition accuracy, 83.7%. These findings indicate that the proposed framework addresses limitations of conventional methods and provides a robust solution for sports video analysis. The study offers a theoretical foundation and a practical methodology for applying advanced DL techniques in competitive sports training and performance evaluation.
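To make the temporal segmentation step concrete, the following is a minimal sketch of how a continuous frame sequence might be cut into fixed-length windows before recognition. Only the 48-frame window length comes from the abstract. The function name `split_into_windows`, the `stride` parameter, and the choice to drop a trailing partial window are assumptions made for illustration, not the authors' procedure.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_into_windows(
    frames: Sequence[T], window: int = 48, stride: int = 48
) -> List[Sequence[T]]:
    """Split a frame sequence into fixed-length windows.

    window: frames per clip (48 is the value reported in the abstract).
    stride: step between window starts (hypothetical; stride == window
            gives non-overlapping clips, smaller values give overlap).
    A trailing partial window is dropped (an assumption of this sketch).
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    last_start = len(frames) - window
    return [frames[i:i + window] for i in range(0, last_start + 1, stride)]


# Stand-in for 120 decoded video frames; real input would be image arrays.
frames = list(range(120))
clips = split_into_windows(frames)
overlapping = split_into_windows(frames, stride=24)
```

Each resulting clip would then be passed to the recognition model. Whether the windows in the study overlap is not stated in the abstract, so the stride is left as a parameter here.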
Zhang et al. studied this question.