Comparative Study on Regression Algorithms for Predicting Price of Online Course: Udemy Case Study
DOI:
https://doi.org/10.32493/informatika.v8i2.30562
Keywords:
Comparative Study, Machine Learning, Price Prediction, Regression
Abstract
Talent in information technology is in high demand, but studying in this field requires a sizable fee. Online courses are a cost-effective alternative. Online course sites such as Udemy sell hundreds of thousands of courses taught by thousands of trusted instructors. Because each instructor sets course prices independently, prices vary widely and do not necessarily reflect course quality, so not every course is a recommended purchase. To address this problem, a system is needed that can predict course prices and thereby advise instructors in setting a selling price. To identify the most suitable algorithm for such a system, this study compares three: multiple linear regression, polynomial regression, and K-Nearest Neighbors regression. The data consist of 1,200 samples obtained by web scraping the Udemy site, and each algorithm was tested once. K-Nearest Neighbors regression achieved the best evaluation results, with a root mean squared error of 231659.49, a mean absolute percentage error of 0.43, and a coefficient of determination of 0.18.
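To make the three evaluation metrics named in the abstract concrete, the following is a minimal Python sketch and not the authors' implementation. It computes the root mean squared error, the mean absolute percentage error (as a fraction, so 0.43 corresponds to 43%), and the coefficient of determination for a small K-Nearest Neighbors regressor. The feature tuples, prices, helper names, and the choice of k = 2 are all hypothetical and chosen only for illustration; the study's actual features and settings are not given here.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the target (price)."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mape(y_true, y_pred):
    """Mean absolute percentage error as a fraction; assumes no target is zero."""
    return sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

def knn_predict(train_X, train_y, x, k):
    """Average the targets of the k training points closest to x (Euclidean distance)."""
    nearest = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], x))[:k]
    return sum(y for _, y in nearest) / len(nearest)

# Hypothetical toy data: (content hours, average rating) -> price.
train_X = [(1.0, 4.0), (2.0, 4.5), (3.0, 4.2), (10.0, 4.8), (12.0, 4.6)]
train_y = [100.0, 120.0, 140.0, 400.0, 440.0]
test_X = [(2.5, 4.3), (11.0, 4.7)]
test_y = [125.0, 400.0]

preds = [knn_predict(train_X, train_y, x, k=2) for x in test_X]
scores = {"rmse": rmse(test_y, preds), "mape": mape(test_y, preds), "r2": r2(test_y, preds)}
```

Read this way, the reported coefficient of determination of 0.18 means the best model explains about 18% of the variance in price, and the reported mean absolute percentage error of 0.43 means predictions are off by about 43% of the true price on average.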
License
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication, with the work simultaneously licensed under a Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).
Jurnal Informatika Universitas Pamulang has adopted CC-BY-NC or an equivalent license as the optimal license for the publication, distribution, use, and reuse of scholarly work.
In developing strategy and setting priorities, Jurnal Informatika Universitas Pamulang recognizes that free access is better than priced access, libre access is better than free access, and libre under CC-BY-NC or the equivalent is better than libre under more restrictive open licenses. We should achieve what we can when we can. We should not delay achieving free in order to achieve libre, and we should not stop with free when we can achieve libre.
Jurnal Informatika Universitas Pamulang is licensed under a Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license.
YOU ARE FREE TO:
- Share: copy and redistribute the material in any medium or format.
- Adapt: remix, transform, and build upon the material.
- The licensor cannot revoke these freedoms as long as you follow the license terms.