What type of study is this?

This is a quantitative study.

October 5, 2025 | Open Access

A Sample-Level Entropy-Weighted Stacking Method for Diabetes Prediction

Key Points

SLE-Stacking achieved an accuracy of 0.987, higher than both the single models and the traditional stacking method it was compared against.
The proposed method utilizes information entropy to address sample-level prediction uncertainty, leading to better model integration.
Experimental results confirm that SLE-Stacking outperformed both single models and conventional stacking methods across various evaluation metrics.
By improving the robustness and generalization of diabetes prediction models, SLE-Stacking offers a feasible approach for chronic disease risk assessment and auxiliary diagnosis.

Abstract

Diabetes mellitus is a prevalent and serious chronic metabolic disease, and its global prevalence continues to rise, posing significant challenges to healthcare systems and economies. Early prediction and risk assessment of diabetes are crucial for timely clinical intervention, individualized treatment, and public health decision-making. While machine learning methods have made considerable progress in diabetes prediction, single models often struggle to balance predictive accuracy and robustness. Ensemble learning approaches, particularly stacking, have been shown to improve performance by leveraging the complementary strengths of multiple base learners. However, conventional stacking methods typically employ fixed weights or rely solely on a meta-learner, without fully accounting for sample-level differences in prediction uncertainty. This study is based on a publicly available diabetes dataset from Kaggle, consisting of 2,768 samples. The dataset was divided into training and testing sets at a 7:3 ratio and preprocessed using z-score normalization. We compared the performance of ten commonly used machine learning models (LR, SVM, KNN, NB, MLP, RF, DT, AdaBoost, XGBoost, and LightGBM), traditional stacking, and the proposed Sample-level Entropy-weighted Stacking method (SLE-Stacking). SLE-Stacking employs information entropy to quantify the predictive uncertainty of base learners at the individual sample level and dynamically assigns fusion weights accordingly, thereby achieving more robust integration. Experimental results show that SLE-Stacking outperformed both single models and traditional stacking across multiple evaluation metrics, including accuracy, precision, recall, F1-score, and AUC. Specifically, SLE-Stacking achieved an accuracy of 0.987, a precision of 0.993, an F1-score of 0.981, and an AUC of 0.990, with particularly notable improvements in F1-score and AUC. 
The proposed SLE-Stacking method effectively enhances the robustness and generalization capability of diabetes prediction models and provides a feasible new approach for the application of medical artificial intelligence in chronic disease risk assessment and auxiliary diagnosis.
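
The abstract describes the core idea (use information entropy to measure each base learner's uncertainty on each individual sample, then assign fusion weights accordingly) but does not give the exact weighting formula. The following is a minimal sketch of one plausible reading, not the authors' implementation. It assumes binary classification, Shannon entropy in bits, and weights proportional to `1 - entropy`; the function names `binary_entropy`, `entropy_weights`, `fuse_sample`, and `fuse_batch` are hypothetical and chosen for illustration only.

```python
import math


def binary_entropy(p, eps=1e-12):
    """Shannon entropy (bits) of a predicted positive-class probability p.

    0 means the learner is certain; 1 means maximally uncertain (p = 0.5).
    """
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def entropy_weights(probs, eps=1e-6):
    """Per-sample fusion weights for one sample.

    ASSUMPTION: weight is proportional to (1 - entropy), so more confident
    learners count more. The paper's actual rule may differ.
    """
    confidences = [1.0 - binary_entropy(p) + eps for p in probs]
    total = sum(confidences)
    return [c / total for c in confidences]


def fuse_sample(probs):
    """Entropy-weighted average of the base learners' probabilities."""
    weights = entropy_weights(probs)
    return sum(w * p for w, p in zip(weights, probs))


def fuse_batch(prob_matrix):
    """Apply the per-sample fusion to rows of shape (n_samples, n_learners)."""
    return [fuse_sample(row) for row in prob_matrix]


if __name__ == "__main__":
    # Illustrative probabilities only, not data from the study.
    # Each row is one sample; each column is one base learner.
    demo = [
        [0.50, 0.99, 0.90],  # first learner is unsure, so it gets little weight
        [0.10, 0.20, 0.45],
    ]
    for row, fused in zip(demo, fuse_batch(demo)):
        print(row, "->", round(fused, 3))
```

Because the weights are recomputed for every sample, a learner that is confident on one patient record and uncertain on another contributes differently to each, unlike stacking with fixed weights. In a full pipeline the fused probability could be thresholded directly or passed to a meta-learner; the abstract does not specify which.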
