September 10, 2025

Monitoring data shift in the learning success forecasting model using Shapley values

Key Points

Monitoring data shift helps keep academic success forecasts accurate over time, so that struggling students can be identified and supported in a timely manner.
Using Shapley values revealed how changes in predictor contributions impacted prediction outcomes, informing model adjustments.
The methodology involved a two-stage monitoring approach to analyze the effectiveness of predictive models over time.
After the Pythia service was integrated into the educational process, forecasts on new data over four academic semesters showed a decline in prediction accuracy, which made monitoring necessary.

Abstract

Predicting academic success is one of the most critical tasks in learning analytics. Solving this problem enables educational institutions to identify students who are struggling with their studies in a timely manner and to apply appropriate pedagogical support and interventions. For predictive models to remain effective in the long term after being integrated into the educational process, it is essential to regularly monitor data shift within these models. This helps detect changes in the distribution of educational data that could lead to decreased forecasting accuracy and affect the interpretation of predictions.

The article presents a methodology for analyzing data shift in academic performance forecasting models using Shapley values, incorporating a two-stage monitoring approach. In the first stage (during the educational process), while the forecasting model is predicting academic success, changes in predictor contributions to the final forecast are identified. In the second stage (after exams), when prediction accuracy can be evaluated, shifts in predictor contributions to the model’s loss function are analyzed. This approach enables the detection of different types of data shift. Based on the analysis results, an educational institution can make timely decisions regarding model adjustment, retraining, or replacement. The proposed methodology was tested on models from the academic performance forecasting service Pythia, developed at Siberian Federal University. After the service was integrated into the educational process, forecasting on new data was carried out over four academic semesters and revealed a decline in prediction accuracy. Through monitoring, several predictors responsible for covariate or concept shift within the ensemble models were identified. As a result, recommendations for model adjustments were proposed.
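The two-stage idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and does not reproduce the Pythia ensemble models: the linear `model`, its `WEIGHTS`, the three predictors (attendance, LMS activity, prior grade), and all data values are hypothetical. Shapley values are computed exactly by enumerating coalitions, with absent predictors replaced by a baseline (the reference-semester mean), which is one common choice among several. Stage 1 compares mean absolute contributions to the forecast between a reference and a current semester; stage 2 does the same for contributions to a squared-error loss once actual outcomes are known.

```python
from itertools import combinations
from math import factorial
from statistics import mean

# Hypothetical toy model: a weighted sum of three predictors
# (attendance, LMS activity, prior grade). Not the Pythia models.
WEIGHTS = [0.6, 0.3, 0.1]

def model(x):
    return sum(w * v for w, v in zip(WEIGHTS, x))

def shapley_values(value_fn, x, baseline):
    """Exact Shapley values of value_fn at x.
    Predictors outside a coalition are set to their baseline value."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                phi[i] += weight * (value_fn(with_i) - value_fn(without_i))
    return phi

def mean_abs_contributions(value_fns, rows, baseline):
    """Average |Shapley value| per predictor over a group of students."""
    per_row = [shapley_values(f, x, baseline) for f, x in zip(value_fns, rows)]
    return [mean(abs(p[i]) for p in per_row) for i in range(len(baseline))]

# Made-up data: in the current semester LMS activity (index 1) has dropped,
# while actual outcomes stayed roughly the same.
reference = [[0.9, 0.5, 0.7], [0.6, 0.4, 0.5], [0.8, 0.6, 0.6]]
current = [[0.9, 0.1, 0.7], [0.6, 0.0, 0.5], [0.8, 0.2, 0.6]]
reference_labels = [0.8, 0.55, 0.7]
current_labels = [0.8, 0.5, 0.7]
baseline = [mean(col) for col in zip(*reference)]

# Stage 1 (during the semester): contributions to the forecast itself.
ref_pred = mean_abs_contributions([model] * len(reference), reference, baseline)
cur_pred = mean_abs_contributions([model] * len(current), current, baseline)
stage1_shift = [c - r for c, r in zip(cur_pred, ref_pred)]

# Stage 2 (after exams): contributions to the squared-error loss.
def loss_fns(labels):
    return [lambda z, y=y: (model(z) - y) ** 2 for y in labels]

ref_loss = mean_abs_contributions(loss_fns(reference_labels), reference, baseline)
cur_loss = mean_abs_contributions(loss_fns(current_labels), current, baseline)
stage2_shift = [c - r for c, r in zip(cur_loss, ref_loss)]

print("Stage 1 shift per predictor:", stage1_shift)
print("Stage 2 shift per predictor:", stage2_shift)
```

In this toy setup the predictor with the largest change in both stages is the one whose distribution was altered, which is the kind of signal that would prompt a closer look at that predictor. For real models with many predictors, exact enumeration becomes too expensive, and an approximate Shapley estimator would be needed instead.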
