Sentiment Analysis of Toxic Comments in Online Game Facebook Groups Using Naïve Bayes Classification


  • Renaldy Permana Sidiq, Universitas Singaperbangsa Karawang
  • Budi Arif Dermawan, Universitas Singaperbangsa Karawang
  • Yuyun Umaidah, Universitas Singaperbangsa Karawang



Toxic comments, TF-IDF, Information Gain, Sentiment Analysis, Naive Bayes


Toxic comments are comments by social media users that express hatred, condescension, threats, or insults. Social media users are, on average, still teenagers who cannot yet fully control their behavior, which makes the comments they post a matter of great concern; these comments can be studied through text processing. Sentiment analysis offers a way to identify toxic comments by classifying them into two classes: toxic and non-toxic. The dataset consists of 1,500 comments collected from a private Arena of Valor community group on Facebook. This research uses Naive Bayes with TF-IDF transformation and Information Gain feature selection, with an 80:20 split between training and testing data. Three configurations are compared: Naive Bayes without transformation, Naive Bayes with TF-IDF transformation, and Naive Bayes with TF-IDF transformation plus Information Gain feature selection. Based on the confusion matrix evaluation, the best classification model uses the 80:20 training-to-testing ratio with TF-IDF transformation, yielding an accuracy of 75%, precision of 63%, recall of 67%, and F-measure of 64%.
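
The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a multinomial Naive Bayes variant with Laplace smoothing (the abstract does not say which variant was used), a simple TF-IDF weighting, and a handful of made-up English tokens standing in for the real Facebook comments. The function names (`fit_idf`, `vectorize`, `train_nb`, `predict`, `evaluate`) are hypothetical, and the Information Gain step and the 80:20 split are left out for brevity.

```python
import math
from collections import Counter, defaultdict


def fit_idf(docs):
    """Compute an IDF weight per term from tokenized training documents."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # "+ 1" keeps terms that occur in every document from getting weight zero.
    return {t: math.log(n / df[t]) + 1.0 for t in df}


def vectorize(doc, idf):
    """Turn one tokenized document into {term: tf * idf}; unseen terms are dropped."""
    return {t: c * idf[t] for t, c in Counter(doc).items() if t in idf}


def train_nb(vectors, labels, alpha=1.0):
    """Multinomial Naive Bayes over TF-IDF weights (assumed variant)."""
    vocab = {t for v in vectors for t in v}
    weight = defaultdict(Counter)  # class -> term -> summed TF-IDF weight
    for v, y in zip(vectors, labels):
        weight[y].update(v)
    priors = {y: math.log(c / len(labels)) for y, c in Counter(labels).items()}
    loglik = {}
    for y in priors:
        total = sum(weight[y].values()) + alpha * len(vocab)
        loglik[y] = {t: math.log((weight[y][t] + alpha) / total) for t in vocab}
    return priors, loglik


def predict(model, vector):
    """Return the class with the highest log posterior score."""
    priors, loglik = model

    def score(y):
        return priors[y] + sum(w * loglik[y][t] for t, w in vector.items())

    return max(priors, key=score)


def evaluate(y_true, y_pred, positive="toxic"):
    """Accuracy, precision, recall and F-measure from the 2x2 confusion matrix."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": (tp + tn) / len(pairs), "precision": precision,
            "recall": recall, "f_measure": f}


# Made-up placeholder comments, NOT taken from the paper's dataset.
train_docs = [["noob", "team", "trash"], ["stupid", "noob", "player"],
              ["nice", "game", "team"], ["good", "game", "thanks"]]
train_labels = ["toxic", "toxic", "non-toxic", "non-toxic"]
test_docs = [["trash", "noob"], ["thanks", "nice", "game"]]
test_labels = ["toxic", "non-toxic"]

idf = fit_idf(train_docs)
model = train_nb([vectorize(d, idf) for d in train_docs], train_labels)
preds = [predict(model, vectorize(d, idf)) for d in test_docs]
print(preds, evaluate(test_labels, preds))
```

Fitting the IDF weights on the training documents only, and reusing them for the test documents, keeps information from the test portion out of the model, which is the usual reason for separating the two sets.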

