Aim/Purpose: This study systematically reviews machine learning (ML) techniques for predicting undergraduate student dropout in higher education, identifies related risk factors, explores methodological gaps in ML-based dropout prediction, and outlines directions for future research and institutional implementation.
Background: Student dropout remains a persistent global challenge with significant academic and socioeconomic implications. Although many studies report high predictive accuracy for dropout models, there is still limited understanding of how these models operationalize educational theory, incorporate psychosocial and equity-related factors, and translate predictions into empirically validated interventions in real institutional settings.
Methodology: The review followed Kitchenham’s methodology and PRISMA 2020 guidelines, using a PICO-C framework focused on undergraduate higher education. Searches from 2019 to 2024 across eight databases yielded 301 records. After screening with predefined inclusion/exclusion criteria, double review with agreement measured by Cohen’s kappa, and a 16-criterion weighted quality assessment (including automated keyword checks), 75 studies met all quality and relevance thresholds.
Contribution: This systematic review synthesizes 75 quality-assessed studies on ML-based undergraduate dropout prediction, clarifying which algorithms currently dominate the field, which risk factors and theoretical frameworks are most frequently used, and where critical gaps remain in generalizability, equity, explainability, temporal modeling, and intervention validation. It is distinctive for its explicit quality-weighting process, its focus on the actionability gap, and its mapping of emerging trends and outstanding needs in the field.
Findings: The systematic review shows rapid growth in dropout prediction research from 2019 to 2024, reflecting increased institutional awareness. Academic performance and engagement indicators are the most commonly used risk factors in prediction models, whereas psychosocial factors (including self-efficacy, sense of belonging, motivation, and resilience) remain significantly underrepresented. Ensemble methods (particularly Random Forest and XGBoost) dominate the algorithm landscape, consistently achieving accuracies of around 86–88%. However, most studies rely on single-institution datasets, do not account for temporal changes, and rarely evaluate the impact or fairness of interventions triggered by predictions. These limitations reduce the generalizability, equity, and practical utility of ML-based predictive models in higher education.
Recommendations for Practitioners: Machine learning models should be viewed as decision-support tools, not standalone solutions. By combining predictive risk assessments with academic, financial, and engagement data, institutions can design multi-dimensional retention strategies, implement early-stage monitoring protocols, and establish clear intervention procedures that are evaluated for effectiveness and fairness.
Recommendations for Researchers: Researchers should move beyond single-institution, static models toward longitudinal, cross-institutional designs that incorporate temporal dynamics, transfer learning, and domain adaptation. Future studies should integrate psychosocial, motivational, and equity-oriented variables; systematically apply fairness-aware machine learning and explainable AI (XAI) techniques; and empirically evaluate whether predictive systems improve retention through controlled intervention studies.
Impact on Society: By clarifying how machine learning can reliably identify students at risk of dropout and exposing current limitations in fairness and implementation, this review supports the design of more equitable and effective retention policies in higher education. Improved early-warning systems and evidence-based interventions can reduce economic losses, mitigate social inequality, and enhance graduation rates, particularly in regions with historically high attrition.
Future Research: Future research should focus on cross-institutional benchmark datasets, longitudinal cohort studies, and models that adapt to diverse institutional and regional contexts. There is a pressing need for studies that integrate psychosocial and structural factors, evaluate fairness across vulnerable groups, apply explainability frameworks as standard practice, and link predictive models to experimentally tested intervention strategies, moving the field from predictive accuracy toward demonstrable educational impact.
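The Methodology mentions double review with Cohen's kappa but does not say how agreement was computed. As a rough illustration only, the sketch below computes the standard two-rater kappa, (p_o - p_e) / (1 - p_e), in plain Python. The function name `cohens_kappa` and the screening decisions are hypothetical and are not taken from the review.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labeled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items given identical labels.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement, from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical screening decisions for ten records (not data from the review).
reviewer_1 = ["include", "include", "exclude", "exclude", "include",
              "exclude", "exclude", "include", "exclude", "exclude"]
reviewer_2 = ["include", "exclude", "exclude", "exclude", "include",
              "exclude", "include", "include", "exclude", "exclude"]

kappa = cohens_kappa(reviewer_1, reviewer_2)
print(round(kappa, 3))
```

In this made-up example the reviewers agree on 8 of 10 records (p_o = 0.8) while chance agreement is p_e = 0.52, so kappa is well below the raw agreement rate. That difference is why screening agreement is usually reported as kappa and not as percent agreement.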
Gloria L Lopez-Muñoz
Carolina González-Serrano
Camilo Sanchez-Ferreira
Journal of Information Technology Education Research
University of Cauca
Lopez-Muñoz et al. studied this question.
www.synapsesocial.com/papers/6a1bd03d5783ba022b6fc12a — DOI: https://doi.org/10.28945/5776