Sentiment Analysis of ChatGPT and Gemini Using the K-Nearest Neighbor, Decision Tree, and Naïve Bayes Algorithms

Authors

  • Dika Prasetya, Master's Program in Informatics Engineering, Graduate School, Universitas Pamulang, South Tangerang, Banten

Keywords:

ChatGPT, Gemini, sentiment analysis, TF-IDF, Naïve Bayes, KNN, Decision Tree

Abstract

The rapid development of artificial intelligence has driven widespread use of AI-based chatbots such as ChatGPT and Gemini. Their broad adoption has generated diverse public opinions, many of which are expressed on social media, particularly X (Twitter). This study analyzes public sentiment toward ChatGPT and Gemini and compares the performance of three classification algorithms, Naïve Bayes, K-Nearest Neighbor (KNN), and Decision Tree, on the sentiment classification task. The research takes a quantitative approach based on text mining techniques. The dataset consists of tweets collected by a crawler written in Python, using keywords related to ChatGPT and Gemini. Preprocessing comprises data cleansing, case folding, tokenization, stopword removal, and stemming. Each tweet is assigned a positive, neutral, or negative sentiment label using the lexicon-based VADER approach. The text is then transformed into numerical features with the Term Frequency–Inverse Document Frequency (TF-IDF) method, and the dataset is divided into training and testing sets for model development and evaluation. The experimental results indicate that Naïve Bayes outperforms the other models with an accuracy of 57.26%, followed by Decision Tree at 54.98% and KNN at 41.59%. Further evaluation using precision, recall, and F1-score indicates that Naïve Bayes also performs more consistently on high-dimensional text data. These findings suggest that, among the three algorithms evaluated, Naïve Bayes is the most effective for sentiment analysis of short social media text.
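The labeling and feature-extraction steps described in the abstract can be illustrated with a minimal sketch. It is not the study's implementation. The ±0.05 cutoffs on the VADER compound score follow the convention suggested in the VADER documentation and are assumed here, since the abstract does not state the thresholds used. The TF-IDF variant (relative term frequency times log(N/df)) is the plain textbook form; libraries usually apply smoothed variants. The function names and toy tokens are hypothetical.

```python
import math
from collections import Counter


def label_from_compound(compound, threshold=0.05):
    """Map a VADER-style compound score in [-1, 1] to one of three classes.

    The 0.05 threshold is an assumed, conventional cutoff.
    """
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    weight = (count / document length) * log(N / document frequency)
    """
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append(
            {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return weights


# Toy, already preprocessed tokens (hypothetical, not taken from the dataset).
docs = [
    ["chatgpt", "help", "code"],
    ["gemini", "answer", "wrong"],
    ["chatgpt", "gemini", "compare"],
]
vectors = tfidf(docs)

# Hypothetical compound scores standing in for real VADER output.
labels = [label_from_compound(score) for score in (0.62, -0.40, 0.0)]
```

In this toy corpus, terms that occur in a single document (such as "help") receive a higher weight than terms shared across documents (such as "chatgpt"), which is the property that makes TF-IDF features useful as classifier input.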

Published

2026-01-31