Automatic Rating Classification of Electronic Product Review Text Documents Using the N-gram and Naïve Bayes Methods

Authors

  • Rahmawan Bagus Trianto, Universitas An Nuur
  • Andri Triyono, Universitas An Nuur
  • Dhika Malita Puspita Arum, Universitas An Nuur

DOI:

https://doi.org/10.32493/informatika.v5i3.6110

Keywords:

Electronic product review, Classification, Naïve Bayes, n-gram, TF-IDF

Abstract

Online stores usually let buyers leave both a descriptive (text) review and a numerical rating, and the Lazada online store is no exception. A descriptive review gives other potential buyers a clearer picture of a product than a rating alone. In practice, however, the descriptive review and the rating a buyer gives do not always match. This mismatch leaves both sellers and potential buyers with unreliable information. This study proposes automatically classifying buyers' descriptive reviews so that the descriptive review and the rating agree. The classification uses the Naïve Bayes algorithm with n-gram feature extraction and TF-IDF term weighting. The best results were obtained with bigram feature extraction: an accuracy of 94.06%, a recall of 91.73%, and a precision of 90.71%. With this level of accuracy, the model can serve as a reference for classifying descriptive product reviews, so that feedback between sellers and buyers works well.
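The pipeline named in the abstract (bigram extraction, TF-IDF weighting, Naïve Bayes) might be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names `ngrams`, `fit`, and `predict`, the whitespace tokenizer, the smoothed IDF formula, the Laplace smoothing, and the four toy English reviews are all assumptions made for the example. The abstract does not give the study's preprocessing, data, or exact formulas.

```python
import math
from collections import Counter, defaultdict


def ngrams(text, n=2):
    """Lowercase, split on whitespace, and return word n-grams (assumed tokenizer)."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def fit(docs, labels, n=2):
    """Train a multinomial Naive Bayes model on TF-IDF-weighted n-grams."""
    grams = [ngrams(d, n) for d in docs]
    # Document frequency: number of documents containing each n-gram.
    df = Counter(g for doc in grams for g in set(doc))
    # Assumed IDF variant: log(N / df) + 1, so no weight is ever zero.
    idf = {g: math.log(len(docs) / df[g]) + 1.0 for g in df}
    # Accumulate TF-IDF weight per class in place of raw counts.
    weights = defaultdict(Counter)
    for doc, label in zip(grams, labels):
        for g, tf in Counter(doc).items():
            weights[label][g] += tf * idf[g]
    priors = {c: math.log(k / len(labels)) for c, k in Counter(labels).items()}
    return {"n": n, "vocab": set(idf), "weights": weights, "priors": priors}


def predict(model, text):
    """Return the class with the highest posterior log-probability."""
    # N-grams never seen in training are ignored.
    grams = [g for g in ngrams(text, model["n"]) if g in model["vocab"]]
    scores = {}
    for c, prior in model["priors"].items():
        class_weights = model["weights"][c]
        # Laplace smoothing: add 1 per vocabulary entry to the denominator.
        total = sum(class_weights.values()) + len(model["vocab"])
        scores[c] = prior + sum(
            math.log((class_weights[g] + 1.0) / total) for g in grams
        )
    return max(scores, key=scores.get)


# Hypothetical toy reviews with star ratings, invented for this example only.
docs = [
    "very good product fast delivery",
    "good product works well",
    "very bad product broken screen",
    "bad product not working",
]
labels = [5, 5, 1, 1]

model = fit(docs, labels, n=2)
print(predict(model, "very good product"))
print(predict(model, "bad product broken"))
```

Passing `n=1` or `n=3` to `fit` switches to unigram or trigram features, which is one way to compare n-gram sizes as the study does.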

Published

2020-09-30