December 19, 2025 | Open Access

Machine Learning Models for Diabetes Prediction: Logistic Regression, SVM, Random Forest, and Neural Networks

Key Points

This research aims to develop accurate predictive models for diabetes using machine learning techniques.
The models were trained on the Pima Indian Diabetes Dataset.
Models evaluated include logistic regression, support vector machine, random forest, and neural network.
Performance metrics include accuracy, precision, recall, and area under the ROC curve (AUC).
Logistic regression achieved the highest AUC (0.84) on this small medical dataset.
Key factors influencing diabetes prediction include blood glucose, body mass index, and age.
Logistic regression offers advantages in stability and interpretability, which may make it well suited to small clinical studies.
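
The four metrics named in the key points can be illustrated with a minimal sketch in plain Python. The labels, scores, and the 0.5 decision threshold below are hypothetical toy values chosen for illustration; they are not taken from the paper, and the helper names are assumptions.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, and recall from binary labels (1 = diabetic)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive case is scored
    higher than a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: true labels and model-predicted probabilities.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.65, 0.4, 0.55, 0.1, 0.8, 0.3]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

accuracy, precision, recall = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(accuracy, precision, recall, auc)
```

Accuracy, precision, and recall depend on the chosen threshold, while AUC summarizes ranking quality across all thresholds, which is one reason it is often reported alongside the other three.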

Abstract

This research applies machine learning techniques to predicting the risk of diabetes. Diabetes is both a personal and a medical problem, and accurate, efficient predictive models are vital for its early screening and detection. This paper uses the Pima Indian Diabetes Dataset to train models on individual clinical features such as blood glucose level, body mass index, and age, and evaluates the predictive capabilities of four widely used supervised learning algorithms: logistic regression, support vector machine, random forest, and neural network. Model performance was measured primarily by accuracy, precision, recall, and area under the ROC curve (AUC). Logistic regression achieved the highest AUC (0.84) on this small medical dataset. Additional experiments show that the factors with the greatest impact on diabetes prediction are blood glucose, body mass index, and age. Compared with more complex models, logistic regression has substantial advantages in stability and interpretability, which may make it more suitable for prediction tasks in small clinical studies such as this one. This work also emphasizes the significance of feature analysis and model selection for medical AI applications, and provides empirical support for data-driven early warning systems for disease prediction.
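
A comparison of this kind might be set up as in the following sketch, which uses scikit-learn. It is not the authors' code: the data is a synthetic stand-in generated with `make_classification` (768 samples and 8 features, chosen to mirror the commonly cited shape of the Pima dataset), and the train/test split, hyperparameters, and coefficient-based feature ranking are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (assumption): replace with the real dataset.
X, y = make_classification(n_samples=768, n_features=8, n_informative=4,
                           n_redundant=0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# The four model families named in the abstract; settings are assumptions.
models = {
    "logistic regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)),
    "support vector machine": make_pipeline(
        StandardScaler(), SVC(probability=True, random_state=0)),
    "random forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "neural network": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000,
                      random_state=0)),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    y_score = model.predict_proba(X_test)[:, 1]  # probability of class 1
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred, zero_division=0),
        "recall": recall_score(y_test, y_pred, zero_division=0),
        "auc": roc_auc_score(y_test, y_score),
    }

for name, metrics in results.items():
    print(name, {k: round(v, 3) for k, v in metrics.items()})

# One common way to rank features (an assumption, not the paper's stated
# method): absolute logistic regression coefficients on standardized inputs.
coefs = models["logistic regression"].named_steps["logisticregression"].coef_[0]
ranking = sorted(range(len(coefs)), key=lambda i: abs(coefs[i]), reverse=True)
print("feature indices by |coefficient|:", ranking)
```

Because the inputs are standardized before fitting, the magnitudes of the logistic regression coefficients are roughly comparable across features, which is part of what makes this model easy to interpret on small datasets. Numbers produced on the synthetic data say nothing about the paper's reported results.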
