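Sentiment Analysis of Toxic Comments in Online Game Facebook Groups Using Naïve Bayes Classification (Sentimen Analisis Komentar Toxic pada Grup Facebook Game Online Menggunakan Klasifikasi Naïve Bayes)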
DOI:
https://doi.org/10.32493/informatika.v5i3.6571
Keywords:
Toxic comments, TF-IDF, Information Gain, Sentiment Analysis, Naive Bayes
Abstract
Toxic comments are comments by social media users that express hatred, condescension, threats, or insults. Social media users are on average still teenagers who have not yet fully learned self-control, which makes the comments they post a matter of serious concern; these comments can be studied as a text-processing problem. Sentiment analysis offers one way to identify toxic comments by classifying them into two classes, toxic and non-toxic. The dataset consists of 1,500 comments collected from a private Facebook group of the Arena of Valor community. This research applies Naive Bayes with TF-IDF transformation and Information Gain feature selection, using an 80:20 split between training and testing data. Three configurations are compared: Naive Bayes without transformation, Naive Bayes with TF-IDF transformation, and Naive Bayes with TF-IDF transformation plus Information Gain feature selection. Based on the confusion-matrix evaluation, the best classification model uses the 80:20 training-to-testing ratio with TF-IDF transformation, achieving an accuracy of 75%, precision of 63%, recall of 67%, and F-measure of 64%.
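The pipeline named in the abstract (TF-IDF weighting, Information Gain feature selection, Naive Bayes classification, confusion-matrix metrics) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the toy English comments, the function names, the smoothed IDF formula, the Laplace smoothing, and the choice of keeping the top 4 terms are all assumptions made for illustration, and the 80:20 split is replaced by a tiny hand-made train/test set.

```python
import math
from collections import Counter

def fit_idf(docs):
    """Smoothed inverse document frequency for each term in the training docs."""
    n = len(docs)
    df = Counter(t for doc in docs for t in set(doc))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

def tfidf(doc, idf):
    """TF-IDF weights of one tokenized comment; terms outside the vocabulary are dropped."""
    return {t: c * idf[t] for t, c in Counter(doc).items() if t in idf}

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values()) if n else 0.0

def information_gain(term, docs, labels):
    """Reduction in class entropy from knowing whether `term` occurs in a comment."""
    present = [y for d, y in zip(docs, labels) if term in d]
    absent = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    return (entropy(labels)
            - (len(present) / n) * entropy(present)
            - (len(absent) / n) * entropy(absent))

def train_nb(vectors, labels, vocab):
    """Multinomial Naive Bayes over TF-IDF weights with Laplace (add-one) smoothing."""
    classes = sorted(set(labels))
    prior = {c: math.log(labels.count(c) / len(labels)) for c in classes}
    loglik = {}
    for c in classes:
        mass = {t: 0.0 for t in vocab}
        for vec, y in zip(vectors, labels):
            if y == c:
                for t, w in vec.items():
                    mass[t] += w
        total = sum(mass.values()) + len(vocab)
        loglik[c] = {t: math.log((w + 1.0) / total) for t, w in mass.items()}
    return prior, loglik

def predict(vec, prior, loglik):
    scores = {c: prior[c] + sum(w * loglik[c][t] for t, w in vec.items()) for c in prior}
    return max(scores, key=scores.get)

def evaluate(y_true, y_pred, positive="toxic"):
    """Accuracy, precision, recall and F-measure from binary confusion-matrix counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": (tp + tn) / len(pairs), "precision": precision,
            "recall": recall, "f_measure": f1}

# Hypothetical toy data (not from the paper's 1,500-comment dataset).
train = [("you are such a noob idiot", "toxic"),
         ("stupid trash player uninstall", "toxic"),
         ("idiot team trash", "toxic"),
         ("good game well played", "non-toxic"),
         ("nice teamwork thanks everyone", "non-toxic"),
         ("great game nice play", "non-toxic")]
test = [("trash noob", "toxic"), ("nice game thanks", "non-toxic")]

train_docs = [text.split() for text, _ in train]
train_labels = [label for _, label in train]

idf = fit_idf(train_docs)
# Feature selection: keep the 4 terms with the highest Information Gain (assumed k).
top_terms = sorted(idf, key=lambda t: information_gain(t, train_docs, train_labels),
                   reverse=True)[:4]
idf = {t: idf[t] for t in top_terms}

prior, loglik = train_nb([tfidf(d, idf) for d in train_docs], train_labels, set(idf))
predictions = [predict(tfidf(text.split(), idf), prior, loglik) for text, _ in test]
metrics = evaluate([label for _, label in test], predictions)
```

Dropping the selection step (leaving `idf` unfiltered) gives the TF-IDF-only configuration, and replacing the TF-IDF weights with raw term counts gives the configuration without transformation, which mirrors the three-way comparison the abstract describes.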
License
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal the right of first publication, with the work simultaneously licensed under a Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).
Jurnal Informatika Universitas Pamulang has adopted CC-BY-NC or an equivalent license as the optimal license for the publication, distribution, use, and reuse of scholarly work.
In developing strategy and setting priorities, Jurnal Informatika Universitas Pamulang recognizes that free access is better than priced access, libre access is better than free access, and libre under CC-BY-NC or the equivalent is better than libre under more restrictive open licenses. We should achieve what we can when we can. We should not delay achieving free in order to achieve libre, and we should not stop with free when we can achieve libre.
Jurnal Informatika Universitas Pamulang is licensed under a Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license.
YOU ARE FREE TO:
- Share: copy and redistribute the material in any medium or format.
- Adapt: remix, transform, and build upon the material for non-commercial purposes.
- The licensor cannot revoke these freedoms as long as you follow the license terms.