Feasibility Analysis of Cooperative Member Financing Using a Comparison of the K-Nearest Neighbors and Naive Bayes Algorithms (A Case Study at KSP XYZ)

Authors

  • Rafi Lutfansyah, Informatics Engineering, Universitas Pamulang

Keywords:

K-Nearest Neighbors, Naive Bayes, CRISP-DM, classification, loan eligibility

Abstract

Savings and loan cooperatives often face the challenge of defaulted loans, which threaten their financial stability and member trust. This study compares the performance of the K-Nearest Neighbors (KNN) and Naive Bayes algorithms in classifying loan eligibility, following the CRISP-DM (Cross-Industry Standard Process for Data Mining) approach. A case study was conducted on member financing data to identify the more accurate classification model for minimizing loan defaults. The CRISP-DM methodology encompasses business understanding, data understanding, data preparation, modeling, evaluation, and deployment. The results show that KNN achieved an accuracy of 92.86%, while Naive Bayes reached only 85.71%; KNN also outperformed Naive Bayes in both precision and recall. KNN was therefore selected as the optimal model to assist cooperatives in predicting loan eligibility. Implementing this model is expected to improve financing efficiency, reduce default risk, and strengthen data-driven decision-making in cooperatives.
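The comparison described in the abstract can be sketched in scikit-learn. This is a minimal illustration only: the synthetic dataset stands in for the cooperative's member financing records (which are not included here), the choice of k=5 and min-max scaling for KNN are assumptions, and the resulting scores will not match the paper's reported figures.

```python
# Hedged sketch: comparing KNN and Gaussian Naive Bayes on synthetic
# stand-in data for loan-eligibility classification. Features, k, and
# scaling choices are illustrative assumptions, not the study's setup.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Synthetic placeholder for member financing data (eligible vs. not).
X, y = make_classification(n_samples=200, n_features=6, n_informative=4,
                           random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42)

# KNN is distance-based, so features are rescaled to [0, 1] first;
# Gaussian Naive Bayes works on the raw features.
scaler = MinMaxScaler().fit(X_train)
X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)

results = {}
for name, model, Xtr, Xte in [
    ("KNN (k=5)", KNeighborsClassifier(n_neighbors=5), X_train_s, X_test_s),
    ("Naive Bayes", GaussianNB(), X_train, X_test),
]:
    model.fit(Xtr, y_train)
    pred = model.predict(Xte)
    results[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "precision": precision_score(y_test, pred),
        "recall": recall_score(y_test, pred),
    }
    print(f"{name}: " + " ".join(f"{k}={v:.4f}"
                                 for k, v in results[name].items()))
```

On the real data, the same three metrics would be computed for both models and the stronger one (KNN, per the study) selected for deployment.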

Published

2024-12-30