Application of the T5-Small Model for Abstractive Text Summarization of Sports News
Keywords:
Abstractive Text Summarization, Transformers, T5, Indonesian Sports News
Abstract
Sports news articles tend to be long and information-dense, so readers often struggle to extract the main points quickly. Writing summaries by hand is inefficient, and research on automatic summarization for Indonesian, especially in the sports domain, is still very limited. This study develops an abstractive text summarization model based on the Transformer architecture (T5-Small) to generate summaries of Indonesian sports news. The dataset was obtained from Kaggle and pre-processed through data cleaning, text normalization, tokenization with T5Tokenizer, and padding and truncation to match the model's input format. The data was split 80% for training, 10% for validation, and 10% for testing. Performance was evaluated with the ROUGE-1, ROUGE-2, and ROUGE-L metrics by comparing each model summary against its reference (gold standard) summary. The results indicate that the model produces relevant summaries reasonably well. The ROUGE-1 score of 0.6011 corresponds to roughly 60% unigram overlap between the model and reference summaries. The ROUGE-2 score of 0.3940 shows that the model also captures adjacent word pairs (bigrams), with close to 40% agreement. The ROUGE-L score of 0.5411, which is based on the longest common subsequence, indicates that the word order of the model summaries largely follows that of the reference summaries. Taken together, the three scores indicate that the model produces informative and consistent summaries.
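The 80/10/10 split and the three ROUGE scores mentioned in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' evaluation code: the abstract does not state which ROUGE variant (precision, recall, or F1) or which library was used, so the sketch assumes F1 over whitespace-separated, lowercased tokens. The function names and the example sentences are hypothetical.

```python
import random
from collections import Counter


def split_80_10_10(items, seed=0):
    """Shuffle and split items into train/validation/test (80/10/10)."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = int(0.8 * len(items))
    n_val = int(0.1 * len(items))
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])


def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, cand_total, ref_total):
    """F1 from an overlap count (assumption: the paper may report another variant)."""
    if overlap == 0 or cand_total == 0 or ref_total == 0:
        return 0.0
    precision = overlap / cand_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n):
    """ROUGE-N F1: clipped n-gram overlap between candidate and reference."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    overlap = sum((cand & ref).values())  # Counter & keeps the minimum counts
    return f1(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """ROUGE-L F1 based on the longest common subsequence."""
    c = candidate.lower().split()
    r = reference.lower().split()
    return f1(lcs_length(c, r), len(c), len(r))


if __name__ == "__main__":
    # Hypothetical sentences, not taken from the paper's dataset.
    reference = "timnas indonesia menang 2-0 atas vietnam"
    candidate = "timnas indonesia menang atas vietnam"
    print(round(rouge_n(candidate, reference, 1), 4))
    print(round(rouge_n(candidate, reference, 2), 4))
    print(round(rouge_l(candidate, reference), 4))
```

In this toy pair the candidate drops one token from the reference, so ROUGE-2 falls more sharply than ROUGE-1 and ROUGE-L: a single missing word breaks two bigrams. The same pattern appears in the reported scores, where ROUGE-2 is the lowest of the three.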
License
Copyright (c) 2026 Dandy Prasetyo Ramadhan, Arya Adyhaksa Waskita

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
