
May 16, 2026 · Open Access

Development of an advanced maternal health risk detection system using enhanced XGBOOST and blending models

Key Points

The study aims to design and test a machine learning model that predicts maternal health risk early in pregnancy.
Used the UCI Maternal Health Risk dataset (n = 1015) with a stratified split into 80% training and 20% testing data.
Applied SMOTENN resampling to the training data only and tuned hyperparameters with internal cross-validation.
Developed optimized XGBoost, blending, and hybrid stacking models for risk prediction.
The hybrid stacking model achieved a ROC-AUC of 0.911 and an accuracy of 80% on the test set.
The model showed high sensitivity for high-risk cases, with a recall of 0.85.
The Brier score (the mean squared error of the predicted probabilities) was 0.07, indicating good probability calibration.
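
The recall and Brier score figures above can be made concrete with a small sketch. It assumes the binary form of both metrics (high risk versus not high risk); the study does not say how the score was aggregated across risk classes, so this is an illustration rather than the authors' evaluation code. The function names and the toy data are hypothetical.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    if len(y_true) != len(y_prob):
        raise ValueError("y_true and y_prob must have the same length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of actual positive cases that were predicted positive."""
    true_pos = sum(
        1 for y, y_hat in zip(y_true, y_pred) if y == positive and y_hat == positive
    )
    actual_pos = sum(1 for y in y_true if y == positive)
    return true_pos / actual_pos if actual_pos else 0.0


# Hypothetical toy data (not from the study): 1 = high risk, 0 = not high risk.
y_true = [1, 0, 1, 1, 0, 0]
y_prob = [0.9, 0.2, 0.6, 0.4, 0.1, 0.3]

# Assumed decision threshold of 0.5 for turning probabilities into labels.
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]

print(f"Brier score: {brier_score(y_true, y_prob):.3f}")
print(f"Recall:      {recall(y_true, y_pred):.3f}")
```

A Brier score of 0 means every predicted probability matched the outcome exactly, while a constant prediction of 0.5 scores 0.25 on a binary task, which gives a rough scale for reading a value such as 0.07.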

Abstract

Objective: Maternal health risks should be identified early to improve pregnancy outcomes and reduce complications during pregnancy. This paper designs and tests a leakage-aware hybrid machine learning framework that predicts maternal health risk using optimized ensemble models.

Methods: The study used the publicly available UCI Maternal Health Risk dataset (n = 1015). The dataset was split with stratification into 80% training and 20% independent test subsets using a fixed random seed (42). SMOTENN resampling was applied only to the training data to avoid data leakage. Internal cross-validation was used for hyperparameter tuning. Optimized XGBoost, blending, and hybrid stacking models were developed. Model performance was measured with accuracy, precision, recall, F1 score, ROC-AUC, confusion matrix analysis, and the probability mean squared error (Brier score).

Results: The hybrid stacking model achieved a ROC-AUC of 0.911 and an overall accuracy of 80% on the independent test set. The model was highly sensitive to high-risk cases (recall = 0.85). The Brier score was 0.07, indicating good probability calibration. The proposed hybrid framework showed better discriminative capability than the baseline models (logistic regression, random forest, and SVM).

Conclusions: The proposed leakage-aware hybrid ensemble framework delivers strong and clinically meaningful performance in maternal health risk prediction. The results highlight the importance of sound validation techniques and probabilistic evaluation measures in healthcare machine learning systems.
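
The order of operations described in the Methods (split first, then resample only the training portion) is what keeps information from the test set out of training. The sketch below illustrates that order with the standard library alone. It is not the authors' pipeline: `stratified_split` and `oversample_minority` are hypothetical helpers, plain random oversampling is used as a deliberately simpler stand-in for the SMOTENN resampling the study reports, and the class sizes are invented for illustration.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=42):
    """Return (train_idx, test_idx), splitting each class in the same proportion."""
    rng = random.Random(seed)  # fixed seed makes the split reproducible
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


def oversample_minority(train_idx, labels, seed=42):
    """Randomly duplicate training indices until all classes are the same size.

    Stand-in for SMOTENN: it only ever draws from train_idx, so no test
    sample can influence the resampled training set.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx in train_idx:
        by_class[labels[idx]].append(idx)

    target = max(len(indices) for indices in by_class.values())
    balanced = []
    for label in sorted(by_class):
        indices = by_class[label]
        extra = [rng.choice(indices) for _ in range(target - len(indices))]
        balanced.extend(indices + extra)
    return balanced


# Hypothetical labels; the class sizes are illustrative, not the dataset's.
labels = ["low"] * 50 + ["mid"] * 30 + ["high"] * 20

train_idx, test_idx = stratified_split(labels)                 # 1. split first
balanced_train_idx = oversample_minority(train_idx, labels)    # 2. resample train only

print(len(train_idx), len(test_idx), len(balanced_train_idx))
```

Resampling before the split would let duplicated or synthetic copies of a test sample appear in training, which tends to inflate test metrics. Keeping the test indices untouched means the reported scores reflect the original class distribution.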
