The Effect of Nazief & Adriani Stemming on the Performance of the Rabin-Karp Algorithm in Detecting Text Similarity
DOI:
https://doi.org/10.32493/informatika.v6i4.16074
Keywords:
Analyse, Effect, Rabin-Karp, Similarity, Stemming
Abstract
Stemming is an information retrieval method that finds the root word of each word in a document. It works by removing prefixes, infixes, suffixes, or confixes. Stemming algorithms for the Indonesian language include those of Vega, Arifin and Setiono, Nazief and Adriani, and Tala. Fingerprinting is a method that traces each character in a sequence of characters, and the Rabin-Karp algorithm is one of the fingerprinting algorithms. Because the algorithm performs text/string matching, it is well suited to detecting text/string similarity. This study analyzes the effect of the Nazief and Adriani stemming method on the performance of the Rabin-Karp algorithm in identifying text/string similarity. The dataset consists of items such as titles, keywords, introductions, or abstracts from the Pamulang Informatics Engineering Journal whose wording had been changed. For test data whose word order was changed randomly, the result with stemming was 0.76% lower than without stemming. For test data whose sentence order was changed randomly, the result with stemming was likewise 0.04% lower.
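As a rough illustration of the fingerprinting idea described in the abstract, the sketch below hashes every k-character window of a text with a Rabin-Karp rolling hash and compares two texts by the overlap of their hash sets. It is a minimal sketch under stated assumptions, not the implementation used in the study: the window length `k`, the `base` and `mod` values, the `normalize` step, and the use of the Dice coefficient as the similarity score are all illustrative choices that the abstract does not specify. Stemming is not included; in a full pipeline it would be applied to the words before fingerprinting.

```python
import re


def normalize(text):
    """Lowercase and drop everything except letters and digits (assumed preprocessing)."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def kgram_hashes(text, k=5, base=256, mod=1_000_003):
    """Return the set of Rabin-Karp rolling hashes of every k-character window."""
    if len(text) < k:
        return set()
    high = pow(base, k - 1, mod)  # weight of the character that leaves the window
    h = 0
    for ch in text[:k]:  # hash of the first window, computed directly
        h = (h * base + ord(ch)) % mod
    hashes = {h}
    for i in range(k, len(text)):
        h = (h - ord(text[i - k]) * high) % mod  # remove the outgoing character
        h = (h * base + ord(text[i])) % mod      # shift and add the incoming one
        hashes.add(h)
    return hashes


def dice_similarity(a, b, k=5):
    """Dice coefficient over the two fingerprint sets (assumed similarity measure)."""
    fa = kgram_hashes(normalize(a), k)
    fb = kgram_hashes(normalize(b), k)
    if not fa and not fb:
        return 0.0
    return 2 * len(fa & fb) / (len(fa) + len(fb))


if __name__ == "__main__":
    # Same words in a different order: some k-grams survive, some do not.
    print(dice_similarity("makan nasi", "nasi makan", k=3))
```

Because the fingerprints are character windows, changing word or sentence order alters only the windows that span the moved boundaries, which is why reordered text can still score as similar.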
References
Bhosale, M. V., & Vankudre, A. A. (2017). Detection of Real-Time Traffic through Twitter Stream Analysis. International Research Journal of Advanced Engineering and Science, 2(2), 124–126.
Christina, S., Oktaviyani, E. D., & Famungkas, B. (2018). Mendeteksi Plagiarism Pada Dokumen Proposal Skripsi Menggunakan Algoritma Jaro Winkler Distance. Jurnal Saintekom, 8(2), 143–153. https://doi.org/10.33020/saintekom.v8i2.68
Hapsari, R. K., & Santoso, Y. J. (2015). Stemming Artikel Berbahasa Indonesia Dengan Pendekatan Confix-Stripping. Prosiding Seminar Nasional Manajemen Teknologi XXII, 1–8.
Hidayatullah, A. F. (2015). The Influence of Stemming on Indonesian Tweet Sentiment Analysis. Proceeding of the Electrical Engineering Computer Science and Informatics, Vol 2, 127–132. https://doi.org/10.11591/eecsi.v2.791
KBBI. (2016). http://kbbi.web.id/jiplak
Mardiana, T., Adji, T. B., & Hidayah, I. (2016). Stemming Influence on Similarity Detection of Abstract Written in Indonesia. TELKOMNIKA (Telecommunication Computing Electronics and Control), 14(1), 219–227. https://doi.org/10.12928/telkomnika.v14i1.1926
Nugroho, H. T. (2017). Pengaruh Algoritma Stemming Nazief-Adriani Terhadap Kinerja Algoritma Winnowing Untuk Mendeteksi Plagiarisme Bahasa Indonesia. Ultima Computing: Jurnal Sistem Komputer, 9(1), 36–40. https://doi.org/10.31937/sk.v9i1.572
Prihatini, P. M., Putra, I. D., Giriantari, I., & Sudarma, M. (2017). Stemming Algorithm for Indonesian Digital News Text Processing. International Journal of Engineering and Emerging Technology, 2(2), 1–7.
Purba, A. H., & Situmorang, Z. (2017). Analisis Perbandingan Algoritma Rabin-Karp Dan Levenshtein Distance Dalam Menghitung Kemiripan Teks. Jurnal Teknik Informatika UNIKA Santo Thomas, 2(2), 24–32. https://doi.org/10.17605/jti.v2i2.187
Putra, D. A., & Sujaini, H. (2016). Implementasi Algoritma Rabin-Karp untuk Membantu Pendeteksian Plagiat pada Karya Ilmiah. JUSTIN (Jurnal Sistem Dan Teknologi Informasi), 4(1), 66–74.
Rahimi, M., & Zahedi, M. (2014). Query expansion based on relevance feedback and latent semantic analysis. Journal of AI and Data Mining, 2(1), 79–84. https://doi.org/10.22044/jadm.2014.188
Rahmaddeni, Sazali, D., & Agustin. (2018). Sistem Pendeteksi Tingkat Kesamaan Teks pada Pengusulan Proposal Penelitian Internal Menggunakan Algoritma Rabin-Karp. SATIN - Sains Dan Teknologi Informasi, 4(2), 84–92. https://doi.org/10.33372/stn.v4i2.415
Ruban, S. S., Serrao, S. B., & Harshitha, L. V. (2015). A Study and Analysis of Information Retrieval Models. International Journal of Innovative Research in Computer and Communication Engineering, 3(7), 230–236.
Simarangkir, M. S. H. (2017). Studi Perbandingan Algoritma - Algoritma Stemming untuk Dokumen Teks Bahasa Indonesia. Jurnal Inkofar, 1(1), 40–46.
Tala, F. Z. (2004). A Study of Stemming Effects on Information Retrieval in Bahasa Indonesia. Universiteit van Amsterdam The Netherlands.
Verdaningroem, N. J. M., & Saifudin, A. (2018). Penerapan Kamus Dasar pada Algoritma Porter untuk Mengurangi Kesalahan Stemming Bahasa Indonesia. Jurnal Teknologi, 10(2), 103–112. https://doi.org/10.24853/jurtek.10.2.103-112
Wicaksono, Y. A., & Suyanto. (2012). Analisis dan Implementasi Algoritma Rabin-Karp dan Algoritma Stemming Nazief-Adriani pada Sistem Pendeteksi Plagiat Dokumen Teks Berbahasa Indonesia. Universitas Telkom.
Yulianingsih. (2017). Implementasi Algoritma Jaro-Winkler dan Levenstein Distance dalam Pencarian Data pada Database. STRING (Satuan Tulisan Riset Dan Inovasi Teknologi), 2(1), 18–27. https://doi.org/10.30998/string.v2i1.1720
Yulianto, M. A., & Nurhasanah, N. (2021). The Hybrid of Jaro-Winkler and Rabin-Karp Algorithm in Detecting Indonesian Text Similarity. Jurnal Online Informatika, 6(1), 88. https://doi.org/10.15575/join.v6i1.640
License
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).
Jurnal Informatika Universitas Pamulang has adopted CC-BY-NC or an equivalent license as the optimal license for the publication, distribution, use, and reuse of scholarly work.
In developing strategy and setting priorities, Jurnal Informatika Universitas Pamulang recognizes that free access is better than priced access, libre access is better than free access, and libre under CC-BY-NC or the equivalent is better than libre under more restrictive open licenses. We should achieve what we can when we can. We should not delay achieving free in order to achieve libre, and we should not stop with free when we can achieve libre.
Jurnal Informatika Universitas Pamulang is licensed under a Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)
YOU ARE FREE TO:
- Share: copy and redistribute the material in any medium or format.
- Adapt: remix, transform, and build upon the material for non-commercial purposes.
- The licensor cannot revoke these freedoms as long as you follow the license terms.