Predicting academic success is one of the most critical tasks in learning analytics. Solving it enables educational institutions to identify struggling students in a timely manner and to provide appropriate pedagogical support and interventions. For predictive models to remain effective long after they are integrated into the educational process, they must be monitored regularly for data shift. Such monitoring detects changes in the distribution of educational data that could reduce forecasting accuracy and distort the interpretation of predictions. The article presents a methodology for analyzing data shift in academic performance forecasting models using Shapley values, based on a two-stage monitoring approach. In the first stage (during the educational process), while the forecasting model is predicting academic success, changes in predictor contributions to the final forecast are identified. In the second stage (after exams), when prediction accuracy can be evaluated, shifts in predictor contributions to the model’s loss function are analyzed. This approach enables the detection of different types of data shift. Based on the results of the analysis, an educational institution can make timely decisions about adjusting, retraining, or replacing a model. The proposed methodology was tested on models from the academic performance forecasting service Pythia, developed at Siberian Federal University. After the service was integrated into the educational process, forecasting on new data was conducted over four academic semesters and revealed a decline in prediction accuracy. Monitoring identified several predictors responsible for covariate or concept shift within the ensemble models, and recommendations for model adjustments were proposed on that basis.
Kustitskaya et al. (Sun,) studied this question.
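
The two-stage idea described above can be illustrated with a minimal sketch. It is not the Pythia implementation: the toy linear model, the three unnamed predictors, the squared-error loss, the synthetic data, and all function names are assumptions made for illustration only. Shapley values are computed exactly by enumerating coalitions, with features outside a coalition replaced by values from a reference (background) sample.

```python
import itertools
import math
import random

N_FEATURES = 3              # hypothetical predictors (assumption)
WEIGHTS = [0.5, 0.3, 0.2]   # toy linear stand-in for a forecasting model (assumption)


def model(z):
    return sum(w * v for w, v in zip(WEIGHTS, z))


def shapley_values(value_fn, x, background):
    """Exact Shapley values of value_fn at point x.

    Features outside a coalition take their values from each background
    row in turn, and value_fn is averaged over those rows.
    """
    d = len(x)

    def coalition_value(S):
        return sum(
            value_fn([x[j] if j in S else b[j] for j in range(d)])
            for b in background
        ) / len(background)

    cache = {
        frozenset(S): coalition_value(frozenset(S))
        for r in range(d + 1)
        for S in itertools.combinations(range(d), r)
    }
    phi = []
    for i in range(d):
        others = [j for j in range(d) if j != i]
        total = 0.0
        for r in range(d):
            weight = math.factorial(r) * math.factorial(d - r - 1) / math.factorial(d)
            for S in itertools.combinations(others, r):
                S = frozenset(S)
                total += weight * (cache[S | {i}] - cache[S])
        phi.append(total)
    return phi


def make_period(rng, n, x0_mean=0.0, x1_coef=0.3):
    """Synthetic semester: x0_mean moves a predictor's distribution,
    x1_coef changes how a predictor relates to the outcome."""
    rows, targets = [], []
    for _ in range(n):
        x = [rng.gauss(x0_mean, 1.0), rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
        y = 0.5 * x[0] + x1_coef * x[1] + 0.2 * x[2] + rng.gauss(0.0, 0.1)
        rows.append(x)
        targets.append(y)
    return rows, targets


def mean_abs_prediction_contrib(rows, background):
    """Stage 1: mean |Shapley value| of each predictor for the forecast."""
    phis = [shapley_values(model, x, background) for x in rows]
    return [sum(abs(p[i]) for p in phis) / len(phis) for i in range(N_FEATURES)]


def mean_loss_contrib(rows, targets, background):
    """Stage 2: mean Shapley value of each predictor for the squared error."""
    phis = [
        shapley_values(lambda z, y=y: (model(z) - y) ** 2, x, background)
        for x, y in zip(rows, targets)
    ]
    return [sum(p[i] for p in phis) / len(phis) for i in range(N_FEATURES)]


rng = random.Random(0)
ref_X, ref_y = make_period(rng, 40)                             # training-era data
new_X, new_y = make_period(rng, 40, x0_mean=2.0, x1_coef=-0.3)  # shifted semester
background = ref_X[:20]

stage1_ref = mean_abs_prediction_contrib(ref_X, background)
stage1_new = mean_abs_prediction_contrib(new_X, background)
stage2_ref = mean_loss_contrib(ref_X, ref_y, background)
stage2_new = mean_loss_contrib(new_X, new_y, background)

for i in range(N_FEATURES):
    print(
        f"predictor {i}: forecast contribution {stage1_ref[i]:.3f} -> {stage1_new[i]:.3f}, "
        f"loss contribution {stage2_ref[i]:.3f} -> {stage2_new[i]:.3f}"
    )
```

In this sketch the first comparison needs no exam results, so it can run while forecasts are being issued; the second requires the observed outcomes and is therefore only available after exams. A predictor whose forecast contribution changes markedly is a candidate for covariate shift, while one whose loss contribution changes is a candidate for concept shift. Exact enumeration is practical only for a handful of predictors; a real ensemble would need an approximate Shapley estimator.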