Diabetes mellitus is a highly prevalent chronic disease; early diagnosis reduces severe complications. This work presents a diabetes prediction pipeline that combines metaheuristic feature selection with machine learning classification. We propose a hybrid Particle Swarm Optimization and Grey Wolf Optimizer (PSO-GWO) with alternating collaboration and an adaptive fitness function that adjusts to class balance, sample size, and dimensionality. Selected features are evaluated with random forest (primary), support vector machines, k-nearest neighbors, and logistic regression. The approach is assessed on three clinical datasets (Pima Indians, Frankfurt Hospital, Iraq) using stratified five-fold cross-validation. At the feature selection stage, the hybrid selector reaches 83.36% mean cross-validation accuracy while retaining about 74% of features on average. At the final classification stage, after random forest hyperparameter optimization on the selected features, the optimized random forest achieves 84.74% mean accuracy. Feature count is reduced by about 26% on average without loss of performance, improving interpretability and prospects for clinical use.
Ziane et al. studied this question.
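To make the evaluation loop concrete, the sketch below shows two pieces that a wrapper-style feature selector of this kind typically needs: a stratified fold splitter and a fitness score for a binary feature mask. It is an illustration under stated assumptions, not the authors' implementation. The abstract does not specify the adaptive fitness function, so a common fixed-weight form (weighted cross-validation error plus fraction of features kept) stands in for it here. The names `stratified_folds`, `subset_fitness`, and the weight `alpha` are hypothetical.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k folds that roughly preserve class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)

    folds = [[] for _ in range(k)]
    offset = 0  # continue the round-robin across classes to keep fold sizes even
    for y in sorted(by_class):
        idxs = by_class[y]
        rng.shuffle(idxs)
        for j, idx in enumerate(idxs):
            folds[(offset + j) % k].append(idx)
        offset += len(idxs)
    return folds


def subset_fitness(mask, cv_accuracy, alpha=0.9):
    """Score a binary feature mask; lower is better.

    Assumed fixed-weight form (NOT the paper's adaptive function):
        alpha * (1 - cv_accuracy) + (1 - alpha) * (selected / total)
    """
    n_selected = sum(mask)
    if n_selected == 0:
        return 1.0  # an empty subset cannot be evaluated; give it the worst score
    error = 1.0 - cv_accuracy
    kept_fraction = n_selected / len(mask)
    return alpha * error + (1.0 - alpha) * kept_fraction


if __name__ == "__main__":
    # Toy labels: 10 negatives and 5 positives, split into 5 folds.
    toy_labels = [0] * 10 + [1] * 5
    for fold in stratified_folds(toy_labels, k=5):
        print(sorted(fold))

    # Hypothetical mask keeping 2 of 4 features, with an assumed CV accuracy of 0.80.
    print(subset_fitness([1, 1, 0, 0], cv_accuracy=0.80))
```

In a full pipeline, `cv_accuracy` would come from training a classifier such as a random forest on the masked features within each fold, and the optimizer (here, the hybrid PSO-GWO) would search over masks to minimize the score. An adaptive variant would presumably vary `alpha` with class balance, sample size, and dimensionality, but how it does so is not stated in the abstract.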