April 20, 2026 | Open Access

Traditional statistics and artificial intelligence-based prognostic models for predicting type 2 diabetes mellitus after gestational diabetes: a systematic review

Abstract

Background: Women with gestational diabetes mellitus (GDM) are at increased risk of developing type 2 diabetes (T2D). Prognostic models have been developed and evaluated, but their methodological quality and applicability remain unclear. Recent reviews, the latest with a search date of March 31, 2025, had major shortcomings, including failure to adhere to best-practice guidance and inappropriate pooling of heterogeneous model performance estimates. This systematic review aims to synthesise the methodological characteristics of existing prognostic models for T2D following GDM.

Methods: Five electronic databases were searched from inception to January 24, 2026. Studies of prognostic models predicting T2D following GDM were included, regardless of study setting. Data extraction followed existing expert guidelines. Quality and applicability were assessed using the updated Prediction model Risk Of Bias ASsessment Tool (PROBAST + AI). Two reviewers independently screened studies and assessed quality, resolving disagreements through consensus and involvement of a third reviewer.

Results: Our updated review identified six more studies than the previous review. In total, it included 19 studies with 20 models, half from prospective cohorts (n = 18), mostly in hospital settings across North America, Europe, Australia, and Asia. Logistic regression was the most common modelling approach (n = 9), followed by machine learning (n = 6) and Cox regression (n = 5). Internal validation was performed for only 13 models and external validation for only 1. Discrimination was widely reported (area under the curve (AUC) 0.67–0.92), whereas calibration, overall performance, and clinical utility measures were underreported. Only one study reported an appropriate sample size determination. Maternal age, fasting glucose during pregnancy, and BMI were common predictors. Risk of bias was generally low in the development and evaluation phases, but applicability concerns were high in 60% of models.

Conclusions: While several models demonstrated acceptable performance and low concern in selected quality domains, generalizability and clinical utility remain limited because of high applicability concerns and inconsistent reporting. Adherence to best-practice guidance such as TRIPOD + AI, external validation of models, and exploration of novel prediction modelling techniques are recommended to advance the reporting and application of risk tools for personalised medicine after GDM. This review is the first to apply PROBAST + AI, enabling a more comprehensive evaluation of quality and applicability than previous work in the field.

Protocol registration: PROSPERO CRD420251034657.
