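Klasifikasi Rating Otomatis pada Dokumen Teks Ulasan Produk Elektronik Menggunakan Metode N-gram dan Naïve Bayes (Automatic Rating Classification of Electronic Product Review Text Documents Using the N-gram and Naïve Bayes Methods)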
DOI:
https://doi.org/10.32493/informatika.v5i3.6110
Keywords:
Electronic product review, Classification, Naïve Bayes, n-gram, TF-IDF
Abstract
Online stores such as Lazada usually let buyers leave both a descriptive review and a rating. A descriptive review gives other potential buyers a clearer picture than a rating alone. In practice, however, the descriptive review and the rating given do not always match, which leaves both sellers and potential buyers without reliable information. This study proposes automatic classification of buyers' descriptive reviews so that the descriptive review and the rating agree. The classification uses the Naive Bayes algorithm with n-gram feature extraction and TF-IDF term weighting. The best result was obtained with bigram feature extraction: an accuracy of 94.06%, a recall of 91.73%, and a precision of 90.71%. With this accuracy, the approach can serve as a reference or model for classifying product description reviews, so that the feedback process between sellers and buyers runs well.
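The pipeline named in the abstract (bigram feature extraction, TF-IDF weighting, Naive Bayes classification, then accuracy, precision, and recall) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the tokenizer, the IDF formula, the smoothing, or the rating classes, so the whitespace tokenizer, the smoothed IDF variant, the Laplace smoothing constant, the labels `high` and `low`, and every toy review below are assumptions made only for illustration.

```python
import math
from collections import Counter

def bigrams(text):
    """Lowercase, split on whitespace, and return adjacent word pairs."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + 2]) for i in range(len(tokens) - 1)]

def idf_table(docs):
    """Smoothed IDF per term: log((1 + N) / (1 + df)) + 1 (an assumed variant)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}

def tfidf(doc, idf):
    """Raw term count times IDF; terms never seen in training are dropped."""
    return {term: count * idf[term] for term, count in Counter(doc).items() if term in idf}

def train_nb(vectors, labels, vocab_size, alpha=1.0):
    """Multinomial Naive Bayes on TF-IDF weights with Laplace smoothing."""
    model = {}
    for label in set(labels):
        weights = Counter()
        for vec, y in zip(vectors, labels):
            if y == label:
                for term, w in vec.items():
                    weights[term] += w
        log_prior = math.log(labels.count(label) / len(labels))
        denom = sum(weights.values()) + alpha * vocab_size
        model[label] = (log_prior, weights, denom)
    return model

def predict(model, vec, alpha=1.0):
    """Return the label with the highest log posterior score."""
    def score(label):
        log_prior, weights, denom = model[label]
        return log_prior + sum(
            w * math.log((weights.get(term, 0.0) + alpha) / denom)
            for term, w in vec.items()
        )
    return max(model, key=score)

def evaluate(y_true, y_pred, positive):
    """Accuracy plus precision and recall for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

# Invented toy reviews with hypothetical rating classes (not the study's data).
train = [
    ("the phone works great and battery lasts long", "high"),
    ("great sound quality and fast delivery", "high"),
    ("works great very happy with this laptop", "high"),
    ("the charger stopped working after two days", "low"),
    ("screen broke and battery drains fast", "low"),
    ("stopped working very disappointed with this phone", "low"),
]
docs = [bigrams(text) for text, _ in train]
labels = [label for _, label in train]
idf = idf_table(docs)
model = train_nb([tfidf(d, idf) for d in docs], labels, vocab_size=len(idf))

def classify(text):
    """Classify one review with the toy model trained above."""
    return predict(model, tfidf(bigrams(text), idf))

held_out = [("the speaker works great", "high"), ("stopped working after a week", "low")]
predicted = [classify(text) for text, _ in held_out]
print(predicted, evaluate([y for _, y in held_out], predicted, positive="high"))
```

Switching `bigrams` for a unigram or trigram extractor is the only change needed to compare n-gram sizes in this sketch. The figures it prints come from the toy data and say nothing about the accuracy reported in the abstract.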
License
Jurnal Informatika Universitas Pamulang is licensed under a Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)