This study compared four commonly used machine learning techniques for predicting non-contact lower limb and lower back injuries in female varsity soccer players: Support Vector Machine (SVM), XGBoost, Long Short-Term Memory (LSTM), and a Hybrid Transformer. Special emphasis was placed on predictive performance and interpretability, with analyses designed to identify the movement phases most relevant to injury risk in applied sports settings. Twenty-five female soccer athletes completed pre-season and mid-season biomechanical assessments comprising anthropometrics, performance and functional tasks (e.g., T-Balance, Y-Balance, L-hop), broad jumps (BJ), countermovement jumps (CMJ), and squats, captured via markerless 3D motion analysis. Each model was trained and evaluated using participant-level leave-one-subject-out cross-validation (LOSO-CV), so that all observations from a given participant were held out together during testing. The Hybrid Transformer achieved the highest performance (AUC = 0.647, accuracy = 66.0%), followed by LSTM (AUC = 0.611), XGBoost (AUC = 0.556), and SVM (AUC = 0.497). Attention analysis indicated that, in injured athletes, the Hybrid Transformer emphasized specific movement phases, including the landing and push-off phases of the L-hop and the anterior and posteromedial reach phases of the Y-Balance. Overall, the Hybrid Transformer performed best among the tested models while offering interpretable temporal attention patterns that highlight when biomechanical differences emerge between injured and non-injured athletes. These phase-specific insights may help inform targeted neuromuscular training and rehabilitation strategies in female soccer players.
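The participant-level LOSO-CV described above can be illustrated with a minimal sketch. It is not the study's pipeline: the toy data, the helper names (`loso_splits`, `midpoint_scores`, `roc_auc`), and the single-feature midpoint scorer are all hypothetical stand-ins for the actual features and models. The sketch also assumes that held-out predictions are pooled across folds before computing one AUC, since a single held-out participant typically carries only one class label; the study may have aggregated its metrics differently.

```python
from collections import defaultdict
from statistics import mean


def loso_splits(subject_ids):
    """Yield (held_out_subject, train_idx, test_idx), one fold per participant."""
    by_subject = defaultdict(list)
    for i, subject in enumerate(subject_ids):
        by_subject[subject].append(i)
    for subject in sorted(by_subject):
        test_idx = by_subject[subject]
        held_out = set(test_idx)
        train_idx = [i for i in range(len(subject_ids)) if i not in held_out]
        yield subject, train_idx, test_idx


def midpoint_scores(train_x, train_y, test_x):
    """Hypothetical stand-in model: score = distance above the midpoint of class means."""
    pos_mean = mean(x for x, y in zip(train_x, train_y) if y == 1)
    neg_mean = mean(x for x, y in zip(train_x, train_y) if y == 0)
    midpoint = (pos_mean + neg_mean) / 2
    return [x - midpoint for x in test_x]


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC requires at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (invented): two observations per participant (e.g., pre- and mid-season).
subject_ids = ["A", "A", "B", "B", "C", "C", "D", "D"]
features = [0.9, 0.8, 0.2, 0.3, 0.7, 0.4, 0.1, 0.5]
labels = [1, 1, 0, 0, 1, 1, 0, 0]  # 1 = injured, 0 = non-injured

pooled_scores = [None] * len(labels)
folds = []
for subject, train_idx, test_idx in loso_splits(subject_ids):
    folds.append((subject, train_idx, test_idx))
    scores = midpoint_scores(
        [features[i] for i in train_idx],
        [labels[i] for i in train_idx],
        [features[i] for i in test_idx],
    )
    # Store each held-out prediction at its original position.
    for i, s in zip(test_idx, scores):
        pooled_scores[i] = s

pooled_auc = roc_auc(labels, pooled_scores)
print(f"folds: {len(folds)}, pooled AUC: {pooled_auc:.3f}")
```

The key property is that the split is made on participant identifiers rather than on individual observations, so no athlete contributes data to both the training and the test side of a fold.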
Zhao et al. (Sun,) studied this question.