The number of customers in every business is growing day by day [1], and customers have more than one choice in every domain, whether financial, governmental, or organizational. This work addresses customer churn prediction in the banking sector. Customer churn is a problem in which banks fail to retain their customers for a variety of shifting reasons, such as better services at lower cost or bank location [2]. Maintaining a good relationship with customers is therefore crucial, because in today's market it costs more to attract new customers than to retain existing ones [3]. We propose a machine learning (ML) approach to this problem. We applied several ML and deep learning (DL) algorithms, including Linear Regression, Logistic Regression, SVM, Artificial Neural Network (ANN), and Random Forest classifier, to the Churn Modelling bank dataset to predict the probability that a customer will exit. This prediction helps banks identify the factors that lead customers to exit, so that they can improve their relationships with those customers. We conclude that the Random Forest classifier gives more accurate results than the other ML and DL algorithms tested. It achieves a test accuracy of 86.05% without handling the class imbalance, 95.16% after oversampling with duplicated data, 89.548% after SMOTE oversampling, and 74.69% after undersampling. A significant limitation of our model is that it was not trained on real-time data; it relies solely on the Kaggle bank churn dataset. Real-time data often presents dynamic and evolving patterns that may differ from those captured in a static dataset.
The results and performance of the models may be further improved by using other algorithms, by increasing the number of hidden layers in the ANN, or by using a real-time dataset.
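Two of the resampling strategies compared in the abstract, oversampling with duplicated data and undersampling, can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy rows, and the 8-to-2 class ratio are hypothetical, and only the Python standard library is used. SMOTE is not shown, since it synthesizes new minority samples by interpolation rather than copying existing rows. Resampling is usually applied to the training split only, so that duplicated rows do not also appear in the test set.

```python
import random
from collections import Counter


def split_by_label(rows):
    """Group (features, label) rows by their label."""
    groups = {}
    for row in rows:
        groups.setdefault(row[1], []).append(row)
    return groups


def random_oversample(rows, seed=0):
    """Duplicate rows of smaller classes at random until all classes match the largest."""
    rng = random.Random(seed)
    groups = split_by_label(rows)
    target = max(len(group) for group in groups.values())
    balanced = []
    for group in groups.values():
        balanced.extend(group)
        # Draw the missing rows with replacement (exact duplicates).
        balanced.extend(rng.choices(group, k=target - len(group)))
    rng.shuffle(balanced)
    return balanced


def random_undersample(rows, seed=0):
    """Drop rows of larger classes at random until all classes match the smallest."""
    rng = random.Random(seed)
    groups = split_by_label(rows)
    target = min(len(group) for group in groups.values())
    balanced = []
    for group in groups.values():
        # Sample without replacement, so no row is repeated.
        balanced.extend(rng.sample(group, target))
    rng.shuffle(balanced)
    return balanced


# Hypothetical toy training split: 8 retained customers (label 0), 2 exited (label 1).
train = [((i, i * 10), 0) for i in range(8)] + [((100 + i, 5), 1) for i in range(2)]

over = random_oversample(train)
under = random_undersample(train)

print(Counter(label for _, label in over))
print(Counter(label for _, label in under))
```

On the toy data, oversampling yields 8 rows per class and undersampling yields 2 rows per class. Undersampling discards most of the majority class, which is one plausible reason a model trained on it would see less information.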
Soni et al. studied this question.