This paper proposes an intelligent sports action error correction system based on multimodal deep learning. The system adopts a three-tier "end-edge-cloud" architecture: the terminal layer collects and preprocesses RGB video and inertial data using low-cost equipment; the edge layer deploys a lightweight model for low-latency, real-time action recognition and initial error correction feedback; and the cloud layer handles data aggregation, execution of the multimodal fusion model, and updates to the personalized error correction strategy, among other tasks. For the key technologies, a multimodal spatiotemporal feature model integrating skeleton keypoints and RGB appearance features is constructed, and a dual-stream spatiotemporal graph convolutional network (DST-GCN) is designed to improve the robustness and accuracy of action recognition in complex teaching environments. A hierarchical error correction strategy based on kinematic rules and individual differences among students is also proposed, covering the complete feedback process from error detection to demonstration guidance. To suit the low-cost hardware available at universities, edge computing optimizations such as knowledge distillation and quantization-aware training are adopted, which significantly reduce model size, latency, and energy consumption. Experiments are conducted on a self-built dataset, HUPAR, which covers six common sports and includes complex environmental factors such as illumination changes and occlusion. The results show that DST-GCN achieves high action recognition accuracy in complex scenes, that the hierarchical error correction strategy significantly reduces the false alarm rate, and that the optimizations greatly improve the feasibility and practicality of edge deployment.
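As a rough illustration of the knowledge distillation step mentioned above, the following minimal sketch computes a standard temperature-scaled distillation loss for a single sample. It is not the authors' implementation: the function names, the temperature of 4.0, the weighting `alpha`, and the six-class logits (chosen only to mirror the six sports in HUPAR) are all assumptions made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.7):
    """Weighted sum of a soft-target term and a hard-label term.

    temperature and alpha are hypothetical defaults, not values
    reported in the source.
    """
    # Soft term: KL(teacher || student) on temperature-softened outputs.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s)
             for t, s in zip(p_teacher, p_student) if t > 0)
    # Hard term: cross-entropy of the student against the true label.
    ce = -math.log(softmax(student_logits)[label])
    # The T^2 factor keeps the soft-term gradients on a comparable scale.
    return alpha * temperature ** 2 * kl + (1 - alpha) * ce


# Hypothetical logits for one clip over six action classes.
teacher = [4.0, 1.0, 0.5, 0.2, 0.1, 0.0]
student = [2.5, 1.5, 0.3, 0.4, 0.0, 0.1]
loss = distillation_loss(student, teacher, label=0)
```

In a training loop, a loss of this form would be averaged over a batch and minimized with respect to the lightweight edge model's parameters, with the larger cloud-side model held fixed as the teacher.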
Xiang et al. (Sun,) studied this question.