February 24, 2024

Analysis of various data imputation techniques for diabetes classification on PIMA dataset

Abstract

Methods for handling missing data in classification tasks need rigorous evaluation as the field of healthcare informatics rapidly expands. Using the PIMA Indian Diabetes dataset, this research provides a thorough analysis of data imputation methods for diabetes classification. We evaluate four popular imputation techniques: Multivariate Imputation by Chained Equations (MICE), k-Nearest Neighbours (KNN), Mean, and Median. Each imputed dataset is then used with a variety of machine learning classifiers, including Decision Trees, Random Forest, Support Vector Classifier (SVC), and Gaussian Naive Bayes. Our objective is to show how these techniques influence the predictive accuracy of classifiers in the context of diabetes diagnosis.
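
To make the compared techniques concrete, the following is a minimal sketch of three of the four imputation strategies named in the abstract (Mean, Median, and KNN), written in plain Python. It is not the authors' implementation: the toy rows, the column choice (glucose, BMI), the helper names `impute_column_stat` and `impute_knn`, and the use of `None` as the missing-value marker are all assumptions made for illustration. MICE and the downstream classifiers are omitted for brevity.

```python
from statistics import mean, median

# Hypothetical toy rows (glucose, BMI); None marks a missing value.
# These numbers are made up for illustration, not taken from the paper.
rows = [
    [148.0, 33.6],
    [85.0, 26.6],
    [None, 23.3],
    [89.0, None],
    [137.0, 43.1],
]

def impute_column_stat(data, stat):
    """Fill each missing entry with a per-column statistic of the observed values."""
    n_cols = len(data[0])
    fills = [stat([r[j] for r in data if r[j] is not None]) for j in range(n_cols)]
    return [[fills[j] if r[j] is None else r[j] for j in range(n_cols)] for r in data]

def impute_knn(data, k=2):
    """Fill each missing entry with the column mean over the k nearest donor rows.

    Distance is the root mean squared difference over features observed in
    both rows; rows sharing no observed feature are treated as infinitely far.
    """
    def dist(a, b):
        shared = [(x - y) ** 2 for x, y in zip(a, b) if x is not None and y is not None]
        return (sum(shared) / len(shared)) ** 0.5 if shared else float("inf")

    out = [list(r) for r in data]
    for i, r in enumerate(data):
        for j, v in enumerate(r):
            if v is None:
                # Donors are other rows that have column j observed.
                donors = [o for m, o in enumerate(data) if m != i and o[j] is not None]
                donors.sort(key=lambda o: dist(r, o))
                out[i][j] = mean(o[j] for o in donors[:k])
    return out

mean_filled = impute_column_stat(rows, mean)
median_filled = impute_column_stat(rows, median)
knn_filled = impute_knn(rows, k=2)
```

In a comparison like the one the abstract describes, each filled table would then be passed to the same set of classifiers so that differences in accuracy can be attributed to the imputation step. Note that copies of the PIMA data commonly encode missing measurements as zeros rather than blanks, so a preprocessing step that maps those zeros to a missing marker is usually assumed before imputing.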

Cite This Study

Jain et al. (2024) studied this question.

synapsesocial.com/papers/68e77c7cb6db6435876f066e https://doi.org/10.1109/sceecs61402.2024.10482050
