This data article presents a comprehensive maternal health risk dataset comprising 6058 samples (retained from 6103 initially collected samples after quality validation and outlier removal) collected through Internet of Things (IoT)-enabled Medical Cyber-Physical Systems (MCPS) across nine healthcare facilities in Bangladesh between February 2021 and January 2023. The dataset includes eight clinical features: age, body temperature, heart rate, systolic and diastolic blood pressure, body mass index (BMI), hemoglobin A1c (HbA1c), and fasting blood glucose levels, with expert-validated three-level risk classifications (high, mid, low). Data collection employed standardized IoT sensors with Raspberry Pi 4 controllers and medical-grade sensor arrays. Quality validation removed 45 physiologically implausible samples (0.74%), ensuring that all values fall within clinically acceptable ranges. Experimental evaluation demonstrates dual-purpose utility: (1) immediate clinical deployment in hospital decision support systems and (2) a research baseline for future temporal modeling studies. Traditional machine learning models achieved clinical-grade performance (defined as ≥95% accuracy with balanced precision/recall, consistent with established clinical decision support benchmarks), with XGBoost and Random Forest both reaching 99.34% accuracy (95% confidence interval: 98.84–99.75%), confirming exceptional data quality for immediate clinical deployment in hospital decision support systems. Feature importance analysis identified diastolic blood pressure (18.1%), body temperature (17.7%), and systolic blood pressure (17.0%) as primary predictive indicators. Additionally, recurrent neural network (RNN) models achieved 93.98% accuracy using synthetic temporal sequences (treating features as timesteps), establishing architectural baselines for future longitudinal studies.
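The "features as timesteps" construction mentioned above can be illustrated with a minimal sketch. It assumes each record is a flat mapping of the eight clinical features; the key names, their ordering, and the example values are hypothetical and are not taken from the dataset or the authors' code.

```python
# Hypothetical feature keys; the dataset's actual column names may differ.
FEATURES = [
    "age", "body_temperature", "heart_rate", "systolic_bp",
    "diastolic_bp", "bmi", "hba1c", "fasting_glucose",
]

def to_pseudo_sequence(record):
    """Convert one cross-sectional record into a synthetic sequence.

    Each of the 8 features becomes one "timestep" carrying a single
    channel, giving a nested list of shape (8, 1). The ordering is
    arbitrary: there is no real temporal relationship between steps.
    """
    return [[float(record[name])] for name in FEATURES]

# Illustrative values only (units are assumptions, not dataset values).
example = {
    "age": 25, "body_temperature": 98.0, "heart_rate": 76,
    "systolic_bp": 120, "diastolic_bp": 80, "bmi": 22.5,
    "hba1c": 5.2, "fasting_glucose": 5.0,
}
sequence = to_pseudo_sequence(example)
```

A batch of such sequences has shape (n_samples, 8, 1), the three-dimensional layout that recurrent layers typically expect. Because the step order carries no temporal meaning, a lower RNN accuracy on this input is unsurprising.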
The 5.36-percentage-point performance gap between XGBoost and Simple RNN (p < 0.001, McNemar's test) confirms that the dataset's cross-sectional structure is best suited to ensemble learning methods, while the RNN experiments provide proof-of-concept infrastructure for temporal modeling when genuine longitudinal measurements become available. This dataset addresses the critical need for maternal health risk prediction in resource-limited settings, supporting both immediate clinical deployment and future research directions. The data demonstrate readiness for integration into clinical decision support systems (validated by 99.34% accuracy), temporal modeling research infrastructure (93.98% RNN baseline), and healthcare technology development. All data underwent rigorous quality control with physiologically validated ranges, expert risk classification, and ethical approval (DUET-IRB-2021–034), making the dataset suitable for academic research, clinical implementation, and public health policy formulation in developing countries.
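As an illustration of how a paired comparison such as the McNemar's test cited above can be computed, the following is a minimal sketch using the continuity-corrected chi-square form with one degree of freedom. The function names and the example counts are hypothetical; the article does not report the discordant-pair counts or state which variant of the test was used.

```python
import math

def discordant_counts(y_true, pred_a, pred_b):
    """Count samples where exactly one of the two models is correct.

    Returns (b, c): b = model A correct and model B wrong,
                    c = model A wrong and model B correct.
    """
    b = c = 0
    for truth, a, other in zip(y_true, pred_a, pred_b):
        a_ok, other_ok = (a == truth), (other == truth)
        if a_ok and not other_ok:
            b += 1
        elif other_ok and not a_ok:
            c += 1
    return b, c

def mcnemar_test(b, c):
    """Continuity-corrected McNemar's test on discordant counts.

    Returns (statistic, p_value). For a chi-square variable with one
    degree of freedom, the survival function is erfc(sqrt(x / 2)).
    """
    if b + c == 0:
        return 0.0, 1.0  # the models never disagree on correctness
    statistic = max(abs(b - c) - 1, 0) ** 2 / (b + c)
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value

# Hypothetical discordant counts, for illustration only.
statistic, p_value = mcnemar_test(70, 5)
```

Only the samples on which the two models disagree enter the statistic, which is why the test suits two classifiers evaluated on the same test set. When the discordant counts are small, an exact binomial version is usually preferred over the chi-square approximation.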
Hossain et al. (Thu,) studied this question.