Analysis For Bank Customer Churn Prediction Using Artificial Intelligence Based on Logistic Regression and Decision Tree

Authors

  • Tukiyat Universitas Pamulang
  • Suryatna Sacadibrata Universitas Pamulang
  • Taufiqur Rahman Universitas Pamulang

Abstract

Customer churn is a significant problem in the banking sector because it reduces profitability and increases the cost of attracting new customers. This study evaluates the effectiveness of the Logistic Regression and Decision Tree algorithms in predicting bank customer churn and identifies factors that influence customers' decisions to stop using bank services. The data come from Kaggle and comprise 10,000 customer records covering demographic information, transaction activity, customer satisfaction, and other risk factors. The data were first cleaned to eliminate duplicates and missing values, then explored to understand patterns and correlations between variables. Logistic Regression and Decision Tree models were then fitted, and their performance was evaluated using accuracy, precision, recall, and F1-score, together with the ROC curve and AUC to assess each model's ability to distinguish between churned and non-churned customers. In the dataset, 79.63% of customers remained active while 20.37% churned, which is a risk indicator for the sustainability of the bank's business. Model evaluation showed that the Decision Tree outperformed Logistic Regression on all four threshold-based metrics: accuracy (0.798 versus 0.777), precision (0.778 versus 0.767), recall (0.819 versus 0.779), and F1-score (0.798 versus 0.773). However, Logistic Regression achieved the higher AUC (0.85 versus 0.80). With its higher recall, the Decision Tree was better at detecting customers who actually churned. To reduce churn, banks are advised to offer loyalty programs and customize products for high-risk customers, such as those with low balances or low credit scores. Other models, such as random forest and gradient boosting, are suggested for future work to improve accuracy and provide deeper insight into reducing churn.
Keywords: bank customer churn, logistic regression, decision tree, artificial intelligence, churn prediction
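
The metrics named in the abstract can be illustrated with a short sketch. It is not the authors' code: the function names `f1_score` and `binary_metrics` and the toy labels are assumptions made for illustration, and the sketch assumes the reported F1-score is the plain harmonic mean of the reported precision and recall (the abstract does not state the averaging scheme). Under that assumption, the reported precision and recall values reproduce the reported F1-scores after rounding to three decimals.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = churned)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # churners caught
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # churners missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # correct non-churn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1_score(precision, recall),
    }


# Precision, recall and F1 exactly as reported in the abstract.
reported = {
    "decision_tree": {"precision": 0.778, "recall": 0.819, "f1": 0.798},
    "logistic_regression": {"precision": 0.767, "recall": 0.779, "f1": 0.773},
}

for name, m in reported.items():
    recomputed = f1_score(m["precision"], m["recall"])
    print(f"{name}: reported F1 = {m['f1']:.3f}, recomputed = {recomputed:.3f}")

# Hypothetical toy labels (not from the study) to show the metric definitions.
toy_true = [1, 0, 1, 1, 0, 0, 1, 0]
toy_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(binary_metrics(toy_true, toy_pred))
```

Recall is the metric that measures how many actual churners a model catches, which is why the abstract ties the Decision Tree's higher recall to its advantage in detecting churned customers. AUC, by contrast, summarizes ranking quality across all decision thresholds, so a model can lead on AUC while trailing on the metrics computed at a single threshold.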

Published

2025-01-10