Voice-Based Emotion Classification Using the Convolutional Neural Network Method
DOI:
https://doi.org/10.32493/informatika.v9i4.45236
Keywords:
CNN, SER, RAVDESS
Abstract
Speech emotion recognition (SER), or voice-based emotion detection, studies the ability of machines to recognize emotional patterns in voice data using a range of methods and features. Its use remains limited, however, because machines still struggle to discern emotions accurately. This research applied a widely used method, the convolutional neural network (CNN), with the aim of producing a high-accuracy model, and used spectrogram features because they capture the frequency content of the recordings in RAVDESS. The data set comprised 2068 voice samples classified into five emotion classes: angry, afraid, happy, sad, and neutral. All data were augmented with noise, pitch changes, shifting, stretching, and high and low speed to replicate real-world conditions. Training explored several parameters: learning rate, dropout rate, kernel initializer, weight decay, optimizer, epochs, and batch size. The best parameter values found were {'weight_decay': 1e-07, 'optimizer': 'adamw', 'learning_rate': 0.001, 'kernel_initializer': 'he_normal', 'dropout_rate': 0.5, 'epochs': 100, 'batch_size': 48}, with a score of 0.7448840381991815. The model reached an overall accuracy of 75.85% on the training data and 51.64% on the test data, indicating that it recognizes patterns in the data it was trained on but has difficulty generalizing to new data. Nevertheless, the ROC curve values (0.84 for angry, 0.79 for fear, 0.83 for happy, 0.80 for sad, and 0.9 for neutral) indicate that the model is capable of differentiating voice data into its respective classes.
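The abstract lists the augmentation types but not how they were implemented, so the following is only a minimal sketch of three of them (noise, shifting, and speed change) using NumPy alone. The function names, the noise factor, the shift length, the speed rates, and the synthetic tone standing in for a RAVDESS clip are all assumptions made for illustration, not the authors' settings. Pitch shifting and duration-preserving time stretching are omitted because they usually rely on an audio library.

```python
import numpy as np

def add_noise(signal, noise_factor=0.005, rng=None):
    """Add Gaussian noise scaled relative to the signal's peak amplitude."""
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.standard_normal(signal.shape[0])
    return signal + noise_factor * np.max(np.abs(signal)) * noise

def time_shift(signal, shift):
    """Circularly shift the waveform by `shift` samples."""
    return np.roll(signal, shift)

def change_speed(signal, rate):
    """Resample by linear interpolation; rate > 1 shortens the clip.

    This naive resampling changes pitch along with speed, unlike a
    phase-vocoder time stretch.
    """
    n_out = int(round(signal.shape[0] / rate))
    old_idx = np.arange(signal.shape[0])
    new_idx = np.linspace(0, signal.shape[0] - 1, n_out)
    return np.interp(new_idx, old_idx, signal)

# Hypothetical input: a 1-second 440 Hz tone at 16 kHz instead of real speech.
sr = 16000
t = np.arange(sr) / sr
clip = 0.5 * np.sin(2 * np.pi * 440 * t)

rng = np.random.default_rng(0)  # fixed seed so the example is repeatable
augmented = {
    "noise": add_noise(clip, rng=rng),
    "shift": time_shift(clip, 1600),   # 0.1 s circular shift
    "fast": change_speed(clip, 1.25),  # "high speed" variant
    "slow": change_speed(clip, 0.8),   # "low speed" variant
}
```

Each augmented waveform would then be converted to a spectrogram before being passed to the CNN, so that one original recording yields several training examples.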
License
Copyright (c) 2024 Muhammad Elio Phillo Rismanto, Irma Handayani

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication, with the work simultaneously licensed under a Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).
Jurnal Informatika Universitas Pamulang considers CC-BY-NC or an equivalent license to be the optimal license for the publication, distribution, use, and reuse of scholarly work.
In developing strategy and setting priorities, Jurnal Informatika Universitas Pamulang recognizes that free access is better than priced access, libre access is better than free access, and libre under CC-BY-NC or the equivalent is better than libre under more restrictive open licenses. We should achieve what we can when we can. We should not delay achieving free in order to achieve libre, and we should not stop with free when we can achieve libre.
Jurnal Informatika Universitas Pamulang is licensed under a Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license.
YOU ARE FREE TO:
- Share : copy and redistribute the material in any medium or format
- Adapt : remix, transform, and build upon the material
- The licensor cannot revoke these freedoms as long as you follow the license terms