Diabetes is a significant public health issue, especially among low- and middle-income groups where clinical diagnostic services are scarce or unavailable. This work develops a non-invasive, affordable, and scalable machine learning (ML) framework for early diabetes screening from binary health survey data. The proposed method addresses healthcare inequities by enabling community-level screening without reliance on laboratory tests. Six classification models (Random Forest, Logistic Regression, Decision Tree, Gradient Boosting, AdaBoost, and a Voting Classifier) were implemented on the 2015 Behavioral Risk Factor Surveillance System (BRFSS) dataset, which contains over 300,000 anonymized records. Recursive Feature Elimination (RFE) and correlation-based feature selection were used to improve model performance and simplicity. The data were preprocessed with label encoding, Z-score normalization, and SMOTE-based class balancing. The models were trained and tested with stratified 5-fold cross-validation and evaluated on accuracy, recall, F1-score, and ROC-AUC. Of all models, the Voting Classifier with RFE achieved the highest recall (0.62), showing the greatest sensitivity to high-risk individuals among the models compared. This supports the use of survey-only data for efficiently identifying persons at risk of developing diabetes in non-clinical settings. The research offers a socially significant and reproducible AI framework for facilitating equitable preventive care, especially in underserved contexts. It aligns with the Sustainable Development Goals (SDG 3: Good Health and Well-being, and SDG 10: Reduced Inequalities) and offers practical takeaways for policymakers, public health practitioners, and NGOs seeking scalable digital health applications.
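To make three of the steps named in the abstract concrete (Z-score normalization, SMOTE-style class balancing, and recall), the following is a minimal standard-library sketch. It is not the authors' implementation: the function names `zscore_columns`, `smote_like`, and `recall` are hypothetical, the toy rows are made up, and `smote_like` is a simplified interpolation in the spirit of SMOTE rather than the reference algorithm from a library such as imbalanced-learn.

```python
import random
from statistics import mean, pstdev


def zscore_columns(rows):
    """Standardize each column to zero mean and unit variance (Z-score)."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sigmas = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(v - m) / s for v, m, s in zip(row, mus, sigmas)] for row in rows]


def smote_like(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority rows by interpolating toward a near neighbour.

    Simplified, illustrative variant of SMOTE; needs at least two minority rows.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Up to k nearest minority neighbours of `base` (squared Euclidean distance)
        others = [r for r in minority if r is not base]
        others.sort(key=lambda r: sum((a - b) ** 2 for a, b in zip(base, r)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


def recall(y_true, y_pred, positive=1):
    """Recall = TP / (TP + FN): the share of actual positives that were flagged."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Toy, made-up binary survey rows: 6 majority (label 0) and 3 minority (label 1).
majority = [[0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0],
            [0, 0, 0, 0], [1, 1, 0, 0], [0, 0, 0, 1]]
minority = [[1, 1, 1, 1], [1, 0, 1, 1], [0, 1, 1, 1]]

scaled = zscore_columns(majority + minority)
scaled_minority = scaled[len(majority):]

# Oversample the minority class until both classes have the same size.
extra = smote_like(scaled_minority, n_new=len(majority) - len(minority))
balanced_minority = scaled_minority + extra
```

In a cross-validated setup, oversampling of this kind is normally applied to the training folds only, so that synthetic rows never leak into the data used to compute recall.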
Hamzah Asyrani Sulaiman
Norazlina Abd Razak
Siti Huzaimah Husin
International Journal of Research and Innovation in Social Science
Sulaiman et al. (Thu,) studied this question.
www.synapsesocial.com/papers/68d464ff31b076d99fa64a07 — DOI: https://doi.org/10.47772/ijriss.2025.908000497