As advances in technology make it easier to expand businesses and develop new ideas, more and more people are applying for loans for personal or business use. However, banks have limited assets, which limits the amount of loans that can be granted. Identifying the right applicants can be a time-consuming process. Banks seek to grant loans to individuals who can repay them on time, enabling the bank to obtain maximum profits. This work aims to solve the loan default problem with minimum costs to banks. It consists of five main stages: pre-processing, feature extraction, machine learning techniques, evaluation models, and performance analysis to select the best machine learning models. Two datasets with different features are used: the first has five features, and the second has eighteen. Each dataset is split using various training percentages (40%, 50%, 60%, and 70%), with the rest used for testing. All experiments use only the Weka application. KNN is applied with 15-, 10-, and 5-fold cross-validation and with different numbers of nearest neighbours (1, 5, 10, and 15). For the first dataset, the highest accuracy is 97.47%, obtained with 15- and 10-fold cross-validation and 10 nearest neighbours. For the second dataset, the highest KNN accuracy is 88.21%, obtained with all three cross-validation settings (15, 10, and 5 folds) and 15 nearest neighbours. Logistic regression is then applied for comparison. Its highest rate of correct classification is 96.93%, obtained with the 70% training split of the first dataset. For the second dataset, the highest accuracy is 88.32%, obtained with 40% of the data used for training and the rest for testing.
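The KNN evaluation protocol described above (a grid over fold counts and neighbour counts, plus percentage splits) can be illustrated with a minimal sketch. The study itself used Weka; the code below is a standard-library Python stand-in, not the authors' implementation. The toy two-class data, the function names (`make_toy_data`, `knn_predict`, `cross_val_accuracy`, `split_accuracy`), and the seeds are all assumptions made for illustration, and the accuracies it produces have no relation to the figures reported for the loan datasets.

```python
import random
from collections import Counter


def make_toy_data(n=120, seed=1):
    """Hypothetical two-class data (NOT the loan datasets): two Gaussian blobs."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        centre = 0.0 if label == 0 else 3.0
        X.append([rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)])
        y.append(label)
    return X, y


def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k training points closest to x.

    Uses squared Euclidean distance; on a tied vote the label seen first
    among the nearest neighbours wins.
    """
    order = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def cross_val_accuracy(X, y, k, folds, seed=0):
    """Accuracy of k-NN under `folds`-fold cross-validation."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    correct = 0
    for f in range(folds):
        test_idx = idx[f::folds]  # every `folds`-th shuffled index
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        correct += sum(
            knn_predict(train_X, train_y, X[i], k) == y[i] for i in test_idx
        )
    return correct / len(X)


def split_accuracy(X, y, k, train_fraction, seed=0):
    """Accuracy of k-NN with a single percentage split (train / test)."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    cut = int(len(idx) * train_fraction)
    train_idx, test_idx = idx[:cut], idx[cut:]
    train_X = [X[i] for i in train_idx]
    train_y = [y[i] for i in train_idx]
    hits = sum(knn_predict(train_X, train_y, X[i], k) == y[i] for i in test_idx)
    return hits / len(test_idx)


# Same grid as in the text: folds in {15, 10, 5}, neighbours in {1, 5, 10, 15}.
X, y = make_toy_data()
results = {
    (folds, k): cross_val_accuracy(X, y, k, folds)
    for folds in (15, 10, 5)
    for k in (1, 5, 10, 15)
}
best_folds, best_k = max(results, key=results.get)
print(f"best setting: {best_folds} folds, k={best_k}")

# Percentage splits as in the text: 40%, 50%, 60%, 70% for training.
for fraction in (0.4, 0.5, 0.6, 0.7):
    print(fraction, round(split_accuracy(X, y, k=5, train_fraction=fraction), 4))
```

Selecting the best setting by comparing cross-validated accuracy across the grid mirrors the comparison reported above; applying the same procedure to real data would additionally require the pre-processing and feature extraction stages, which are not sketched here.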
Raad et al. studied this question.