Type 1 diabetes (T1D) is a chronic autoimmune condition with a rising global incidence. Early prediction of disease onset and detection of preclinical progression are critical for timely intervention. Machine learning (ML) can analyze complex, high-dimensional data and may improve risk prediction across different stages of T1D development. This systematic review evaluates the application and performance of ML models for predicting T1D onset and early disease-related outcomes. Following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines, a structured search was conducted in PubMed, British Medical Journals, Scopus, IEEE Xplore, and Web of Science for studies published between 2021 and 2025. Eligible studies developed or validated ML models for T1D prediction or early detection. Study selection, data extraction, and risk of bias assessment with the Prediction model Risk of Bias Assessment Tool (PROBAST) were performed, and findings were synthesized narratively because of heterogeneity in study design, populations, prediction targets, and outcome measures. Fourteen studies were included, with sample sizes ranging from 32 to over 800,000 participants. ML approaches included logistic regression, random forests, support vector machines, and gradient boosting methods. Reported area under the receiver operating characteristic curve (AUROC) ranged from 0.73 to 0.92, and prediction horizons ranged from short-term outcomes (minutes to hours) to long-term disease onset (up to 10 years). However, study heterogeneity was substantial, and only three studies performed external validation. While most studies were rated as low risk of bias, several high-performing models were based on small samples or limited validation, raising concerns about overfitting and generalizability.
ML models demonstrate potential for improving prediction of T1D onset and early disease-related outcomes, but current evidence is limited by variability in methods, inconsistent validation, and uncertain clinical applicability. Future research should prioritize large, prospective, and externally validated studies, with greater emphasis on model transparency, generalizability, and real-world implementation.
Aldeen et al. studied this question.