Implementation of the Kappa Architecture with Kafka and Spark for Processing Hypertension Data on the Social Media Platform X
DOI:
https://doi.org/10.32493/jtsi.v9i1.57819
Keywords:
Kappa Architecture, Apache Kafka, Apache Spark, Hypertension, X
Abstract
Hypertension is a major public health problem with a continuously increasing prevalence and is widely discussed on the social media platform X. The dynamic and continuously flowing nature of social media data requires a Big Data-based processing approach capable of operating in real time and in a scalable manner. This study aims to implement a streaming-based Big Data architecture (Kappa Architecture) using Apache Kafka and Apache Spark to process and analyze conversations about hypertension on X in real time. The proposed system integrates the X API as the data source, Apache Kafka as the immutable event log and streaming backbone, Apache Spark Structured Streaming as the real-time data processing engine, and MongoDB as the serving layer. The research methodology includes a literature review, system design, streaming-based data collection, real-time text cleaning and feature extraction, and performance evaluation using throughput, latency, and success rate parameters. A total of 10,000 tweets were collected over a two-month period and processed through a unified streaming pipeline. The implementation results show that the system successfully established a consistent end-to-end processing workflow, enabling real-time data ingestion and processing without separating batch and speed layers. The system achieved an average throughput of 19.23 tweets per second, a latency of approximately 520 seconds, and a success rate of 100%. This study concludes that the Kappa Architecture is effective, stable, and scalable for real-time processing and analysis of social media data in monitoring public health issues such as hypertension.
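The three evaluation parameters named above can be related with a short calculation. The following is a minimal sketch, not the authors' evaluation code: `RunStats` and its field names are hypothetical, and it assumes that the figure of approximately 520 seconds is the wall-clock time for the full set of 10,000 tweets, which the abstract does not state explicitly. Under that assumption the reported throughput follows directly from the other two numbers.

```python
from dataclasses import dataclass


@dataclass
class RunStats:
    """Counters for one pipeline run (hypothetical names, not from the paper)."""
    received: int           # records read from the stream
    processed: int          # records successfully written to the serving layer
    elapsed_seconds: float  # assumed wall-clock time from first read to last write


def throughput(stats: RunStats) -> float:
    """Records processed per second of wall-clock time."""
    if stats.elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return stats.processed / stats.elapsed_seconds


def success_rate(stats: RunStats) -> float:
    """Percentage of received records that were processed successfully."""
    if stats.received == 0:
        raise ValueError("no records received")
    return 100.0 * stats.processed / stats.received


# Figures taken from the abstract: 10,000 tweets, about 520 seconds, no failures.
run = RunStats(received=10_000, processed=10_000, elapsed_seconds=520.0)
print(f"throughput: {throughput(run):.2f} tweets/s")
print(f"success rate: {success_rate(run):.0f}%")
```

With these inputs the sketch gives about 19.23 tweets per second and a 100% success rate, matching the reported values. If latency were instead measured per record, the elapsed time would need to be recorded separately.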
Copyright (c) 2026 Legawan Perkasa, Tikardiha Hardiani

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.