Sentiment Analysis of Sundanese-Language Tweets Using a Naive Bayes Classifier with Chi Squared Statistic Feature Selection
DOI:
https://doi.org/10.32493/informatika.v4i3.3186
Keywords:
Sentiment Analysis, Sundanese, Twitter, Naïve Bayes Classifier (NBC), Chi Squared Statistic
Abstract
Social media use in Indonesia is growing very rapidly. Indonesia has a variety of regional languages, one of which is Sundanese, and some people, especially those living in West Java, use Sundanese to express comments, opinions, suggestions, criticisms, and more on social media. This information can serve as valuable data for individuals or organizations in decision making, but the sheer volume of data makes it impossible for humans to read and analyze it manually. Sentiment analysis is the process of classifying, analyzing, understanding, and evaluating opinions, emotions, and attitudes towards a particular entity, such as an individual, organization, product or service, topic, or event, in order to obtain information. The purpose of this research is to show that the Naïve Bayes Classifier (NBC) algorithm, combined with the Chi Squared Statistic feature selection method, can be used for sentiment analysis of Sundanese-language tweets on Twitter, classifying them into positive, negative, and neutral categories. The test results show that Chi Squared Statistic feature selection can reduce irrelevant features in the NBC classification process on Sundanese-language tweets, with an accuracy of 78.48%.
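As an illustration of the pipeline the abstract describes, the following is a minimal sketch and not the authors' implementation. It assumes the common 2x2 contingency-table form of the chi-squared statistic for a term and a class, takes the maximum over classes as the term's score, and then trains a multinomial Naive Bayes model with add-one smoothing on the selected terms only. The function names (`chi2_scores`, `select_top_k`, `train_nb`, `predict`), the value of `k`, and the six toy tweets are all hypothetical and are not taken from the paper's dataset.

```python
import math
from collections import Counter, defaultdict

def chi2_scores(docs, labels):
    """Score each term by its maximum chi-squared value over all classes."""
    n = len(docs)
    class_count = Counter(labels)
    term_count = Counter()             # number of documents containing the term
    term_class = defaultdict(Counter)  # term -> class -> documents containing the term
    for tokens, y in zip(docs, labels):
        for t in set(tokens):
            term_count[t] += 1
            term_class[t][y] += 1
    scores = {}
    for t, nt in term_count.items():
        best = 0.0
        for c, nc in class_count.items():
            a = term_class[t][c]   # term present, class c
            b = nt - a             # term present, other classes
            cc = nc - a            # term absent, class c
            d = n - nt - cc        # term absent, other classes
            denom = (a + cc) * (b + d) * (a + b) * (cc + d)
            if denom:              # a term in every document carries no signal
                best = max(best, n * (a * d - cc * b) ** 2 / denom)
        scores[t] = best
    return scores

def select_top_k(scores, k):
    """Keep the k highest-scoring terms as the vocabulary."""
    return set(sorted(scores, key=scores.get, reverse=True)[:k])

def train_nb(docs, labels, vocab):
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""
    class_count = Counter(labels)
    word_count = defaultdict(Counter)
    for tokens, y in zip(docs, labels):
        word_count[y].update(t for t in tokens if t in vocab)
    model = {}
    for c in class_count:
        total = sum(word_count[c].values()) + len(vocab)
        log_prior = math.log(class_count[c] / len(docs))
        log_like = {t: math.log((word_count[c][t] + 1) / total) for t in vocab}
        model[c] = (log_prior, log_like)
    return model

def predict(model, tokens):
    """Return the class with the highest log posterior; unknown terms are ignored."""
    def score(c):
        log_prior, log_like = model[c]
        return log_prior + sum(log_like[t] for t in tokens if t in log_like)
    return max(model, key=score)

# Invented toy tweets (already tokenized), for illustration only.
docs = [
    ["film", "ieu", "alus", "pisan"],
    ["resep", "kana", "film", "ieu"],
    ["film", "ieu", "awon", "pisan"],
    ["ngewa", "kana", "film", "ieu"],
    ["ayeuna", "nonton", "film", "ieu"],
    ["isukan", "nonton", "film", "ieu"],
]
labels = ["positive", "positive", "negative", "negative", "neutral", "neutral"]

scores = chi2_scores(docs, labels)
vocab = select_top_k(scores, k=7)   # k is an arbitrary choice for this sketch
model = train_nb(docs, labels, vocab)
print(predict(model, ["film", "ieu", "alus"]))
```

In this toy corpus, terms that appear in every tweet (such as `film` and `ieu`) receive a score of zero and drop out of the vocabulary, which is the sense in which chi-squared selection removes features that do not help separate the classes.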
License
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).
Jurnal Informatika Universitas Pamulang has CC-BY-NC or an equivalent license as the optimal license for the publication, distribution, use, and reuse of scholarly work.
In developing strategy and setting priorities, Jurnal Informatika Universitas Pamulang recognizes that free access is better than priced access, libre access is better than free access, and libre under CC-BY-NC or the equivalent is better than libre under more restrictive open licenses. We should achieve what we can when we can. We should not delay achieving free in order to achieve libre, and we should not stop with free when we can achieve libre.
Jurnal Informatika Universitas Pamulang is licensed under a Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)
YOU ARE FREE TO:
- Share: copy and redistribute the material in any medium or format.
- Adapt: remix, transform, and build upon the material, for non-commercial purposes only.
- The licensor cannot revoke these freedoms as long as you follow the license terms.