Chronic Kidney Disease (CKD) remains a significant global health challenge. It progresses silently, without early symptoms, which makes timely intervention and treatment difficult. Early detection is critical for effective management, and this study addresses the need for better predictive models using advanced machine learning techniques. The main aim was to build a model that predicts CKD with high accuracy while overcoming the challenges of high-dimensional data, class imbalance, and overfitting. The study began with extensive data preprocessing: missing values were handled with the Multiple Imputation by Chained Equations (MICE) method, and class imbalance was resolved with the Synthetic Minority Over-sampling Technique (SMOTE). Outliers were detected and handled using the interquartile range (IQR) method, and Z-score normalization was used to standardize the features. Ridge Feature Selection (RFS), which combines L2 regularization with Recursive Feature Elimination (RFE), was then applied so that only the most relevant features were retained. A hybrid classification model, the SKL Hybrid Classifier, was built by integrating Support Vector Machine (SVM), K-Nearest Neighbors (KNN), and Logistic Regression to provide accurate predictions of CKD. The model obtained an accuracy of 96%, with a precision of 0.97 and a recall of 0.99 for the CKD class, indicating high sensitivity in detecting CKD cases. Hyperparameter optimization with Optuna further tuned the model, raising its accuracy to 99%. While the study primarily employs established techniques, its novelty lies in the systematic integration of data preprocessing, hybrid Ridge Feature Selection (RFS), and optimized stacking ensemble modeling into a single, reproducible pipeline. This integration improves generalization and interpretability on small clinical datasets such as those available for CKD.
Future work will focus on external clinical validation and decision-analytic evaluation to confirm the model’s real-world applicability and impact in clinical decision-support environments.
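As a rough illustration of how the stages named in the abstract could be chained into one pipeline, the sketch below uses scikit-learn on synthetic data. It is not the authors' implementation, and several choices are assumptions: `IQRClipper` is a hypothetical helper that caps outliers at the usual 1.5 × IQR fences, `IterativeImputer` stands in as a MICE-style imputer, RFS is read as RFE driven by an L2-penalized `RidgeClassifier`, and the hybrid classifier is read as a stack with SVM and KNN as base learners and Logistic Regression as the meta-learner. The number of retained features (10) and the synthetic data settings are arbitrary. SMOTE and Optuna tuning are left out because they require the separate `imbalanced-learn` and `optuna` packages.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.feature_selection import RFE
from sklearn.impute import IterativeImputer
from sklearn.linear_model import LogisticRegression, RidgeClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


class IQRClipper(TransformerMixin, BaseEstimator):
    """Hypothetical helper: cap each feature at [Q1 - k*IQR, Q3 + k*IQR]."""

    def __init__(self, k=1.5):
        self.k = k

    def fit(self, X, y=None):
        # Learn the fences on the training data only.
        q1, q3 = np.percentile(X, [25, 75], axis=0)
        iqr = q3 - q1
        self.lower_ = q1 - self.k * iqr
        self.upper_ = q3 + self.k * iqr
        return self

    def transform(self, X):
        return np.clip(X, self.lower_, self.upper_)


# Synthetic stand-in for a small clinical table, with about 5% missing cells.
X, y = make_classification(
    n_samples=400, n_features=24, n_informative=8, random_state=0
)
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.05] = np.nan

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Assumed arrangement: SVM and KNN as base learners, LR as meta-learner.
stack = StackingClassifier(
    estimators=[("svm", SVC()), ("knn", KNeighborsClassifier())],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)

pipe = Pipeline(
    steps=[
        ("impute", IterativeImputer(random_state=0)),  # MICE-style imputation
        ("outliers", IQRClipper()),                    # IQR-based capping
        ("scale", StandardScaler()),                   # Z-score normalization
        ("rfs", RFE(RidgeClassifier(), n_features_to_select=10)),
        ("clf", stack),
    ]
)

pipe.fit(X_train, y_train)
acc = pipe.score(X_test, y_test)
print(f"held-out accuracy on synthetic data: {acc:.3f}")
```

Keeping every step inside one `Pipeline` means the imputer, IQR fences, scaler, and feature selector are all fitted on the training split only, which avoids leaking test information into preprocessing. The accuracy printed here reflects the synthetic data and says nothing about the figures reported in the study.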
Venkata Gurumurthy Reddy Saragada (Fri,) studied this question.