Majors Determination for High School Students Using the Naïve Bayes Algorithm, C4.5 and the K-Nearest Neighbor Algorithm (Case Study: SMA 1 Barunawati Jakarta)

Authors

  • Yudisti Prayigo Permana
  • Taswanda Taryo
  • Makhsun Makhsun

Abstract

Education plays a central role in shaping a sound mindset and in helping students develop their potential, so that they become better individuals whose knowledge benefits others. Placing students into majors is a key step in identifying their interests and talents and in supporting their learning. Placement must be done carefully and from several angles, because a wrong placement affects students' academic scores. Several aspects are considered in the placement process: students' academic scores from academic tests are compared with the results of psychological tests and of questionnaires about the majors that interest them, so producing a placement takes a long time. Classifying majors is difficult for the school because each criterion must be calculated separately and there is no placement system that classifies majors with enough accuracy for the results to match students' abilities and interests. This study compares three algorithms, Naïve Bayes, C4.5 and K-Nearest Neighbor, to find the one that best classifies majors, with the goal of making learning more interesting and active because it matches students' interests and talents. Naïve Bayes is a probability-based classification method that predicts a class under the assumption that the attributes are independent of one another given the class. The C4.5 algorithm classifies data that has numeric and categorical attributes, and the K-Nearest Neighbor algorithm assumes that a data point has the same class or category as the data points nearest to it.
In tests on a dataset of 214 records, the Naïve Bayes algorithm achieved an accuracy of 98.13%, higher than that of the C4.5 and K-Nearest Neighbor algorithms. Comparisons were also made using random data together with real data, at sizes of 50, 100, 214, 300, 400 and 428 records. It can be concluded that the Naïve Bayes algorithm is suitable for this case because it has the highest accuracy and remains stable regardless of the amount of data tested.
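
As a rough illustration of the three ideas named in the abstract, the following Python sketch implements a K-Nearest Neighbor vote, a Naïve Bayes classifier, and the gain ratio that C4.5 uses to choose a split. It is not the authors' implementation. The Gaussian variant of Naïve Bayes, the Euclidean distance, the value k=3, the attribute names, and every record in the toy data are assumptions made for the example, and only the split criterion of C4.5 is shown rather than a full decision tree. The toy data will not reproduce the reported accuracy figures.

```python
import math
from collections import Counter

# Made-up toy records, NOT the study's data: (science score, social score) -> major
TRAIN = [
    ((85.0, 60.0), "science"),
    ((90.0, 65.0), "science"),
    ((78.0, 70.0), "science"),
    ((88.0, 55.0), "science"),
    ((60.0, 85.0), "social"),
    ((65.0, 90.0), "social"),
    ((70.0, 80.0), "social"),
    ((55.0, 88.0), "social"),
]


def knn_predict(train, x, k=3):
    """Majority class among the k training points closest to x (Euclidean distance)."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def gaussian_nb_fit(train):
    """Estimate, per class, the prior and each feature's mean and variance."""
    by_class = {}
    for features, label in train:
        by_class.setdefault(label, []).append(features)
    model = {}
    for label, rows in by_class.items():
        stats = []
        for column in zip(*rows):
            mean = sum(column) / len(column)
            # Small constant keeps the variance strictly positive
            var = sum((v - mean) ** 2 for v in column) / len(column) + 1e-9
            stats.append((mean, var))
        model[label] = (len(rows) / len(train), stats)
    return model


def gaussian_nb_predict(model, x):
    """Pick the class with the highest log posterior, treating features as independent given the class."""
    def log_posterior(label):
        prior, stats = model[label]
        total = math.log(prior)
        for value, (mean, var) in zip(x, stats):
            total += -0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)
        return total
    return max(model, key=log_posterior)


def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(items)
    return -sum((c / n) * math.log2(c / n) for c in Counter(items).values())


def gain_ratio(values, labels):
    """C4.5 split criterion for a categorical attribute: information gain / split information."""
    n = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    split_info = entropy(values)
    return (entropy(labels) - remainder) / split_info if split_info > 0 else 0.0


def accuracy(predict, test):
    """Fraction of test records whose predicted class equals the true class."""
    return sum(predict(x) == y for x, y in test) / len(test)


if __name__ == "__main__":
    # Hypothetical held-out records and a hypothetical questionnaire attribute
    test = [((82.0, 62.0), "science"), ((62.0, 86.0), "social")]
    interest = ["science", "science", "social", "science",
                "social", "social", "science", "social"]
    model = gaussian_nb_fit(TRAIN)
    print("KNN accuracy:", accuracy(lambda x: knn_predict(TRAIN, x), test))
    print("Naive Bayes accuracy:", accuracy(lambda x: gaussian_nb_predict(model, x), test))
    print("Gain ratio of 'interest':", gain_ratio(interest, [y for _, y in TRAIN]))
```

Running the same `accuracy` function on subsets of different sizes is one way to check how stable each classifier stays as the amount of data changes, which is the kind of comparison the abstract describes.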

Keywords: Data Mining, Classification, Major, Naïve Bayes, C4.5, K-Nearest Neighbor

Published

2022-11-15