Visual Analysis and Characteristics of English League Football Clubs Based on Playing Patterns Using K-Means Clustering
DOI:
https://doi.org/10.32493/informatika.v9i3.44640
Keywords:
English Premier League, K-Means, Playing Characteristics, Football Analytics, Cluster Evaluation, Feature Engineering
Abstract
This research aimed to analyze and cluster football teams in the English Premier League (EPL) for the 2023/2024 season based on their playing characteristics using K-Means clustering. Understanding playing styles is essential for optimizing strategies and enhancing team performance. Preprocessing steps included data cleaning, feature engineering, and visualization of key features such as goals, shots, and attacking attempts. Four clusters were identified using the Elbow method, representing teams with varying levels of attacking and defensive capabilities. The clustering results were evaluated using the Davies-Bouldin index (0.47), the Calinski-Harabasz index (275.89), and the Silhouette coefficient (0.53), indicating moderate clustering quality. The findings suggest that EPL teams tend to be attack-oriented, while defensive strength varies across clusters. Limitations in the dataset, such as the small number of observations and features, constrained the analysis, and future studies may benefit from incorporating additional features and advanced dimensionality reduction techniques.
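The workflow summarized in the abstract (fit K-Means for several values of k, inspect the inertia curve for an elbow, then score the chosen partition) can be sketched in plain Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the `teams` data are invented, already-standardized two-feature points rather than real EPL statistics; the deterministic farthest-point seeding is a substitute for the random or k-means++ seeding a library would normally use; and k is picked programmatically by the highest silhouette value as a stand-in for reading the elbow by eye. The Calinski-Harabasz index is omitted for brevity.

```python
import math

def farthest_point_init(points, k):
    """Assumed seeding: start at points[0], then repeatedly add the point
    farthest from its nearest already-chosen centroid (deterministic)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    return centroids

def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm; returns (labels, centroids, inertia)."""
    centroids = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
    inertia = sum(math.dist(p, centroids[lab]) ** 2
                  for p, lab in zip(points, labels))
    return labels, centroids, inertia

def silhouette(points, labels):
    """Mean silhouette coefficient; needs at least two non-empty clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # singleton clusters contribute 0 by convention
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for other_lab, other in clusters.items() if other_lab != lab)
        total += (b - a) / max(a, b)
    return total / len(points)

def davies_bouldin(points, labels, centroids):
    """Davies-Bouldin index (lower is better); assumes no empty cluster."""
    k = len(centroids)
    scatter = []
    for c in range(k):
        members = [p for p, lab in zip(points, labels) if lab == c]
        scatter.append(sum(math.dist(p, centroids[c]) for p in members)
                       / len(members))
    return sum(
        max((scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in range(k) if j != i)
        for i in range(k)) / k

# Hypothetical standardized features for 20 teams, e.g. (attack, defence).
centers = [(-2.0, -2.0), (-2.0, 2.0), (2.0, -2.0), (2.0, 2.0)]
offsets = [(0.0, 0.0), (0.2, 0.0), (-0.2, 0.0), (0.0, 0.2), (0.0, -0.2)]
teams = [(cx + dx, cy + dy) for cx, cy in centers for dx, dy in offsets]

inertias, silhouettes = {}, {}
for k in range(1, 7):
    labels, centroids, inertias[k] = kmeans(teams, k)
    if k >= 2:
        silhouettes[k] = silhouette(teams, labels)

best_k = max(silhouettes, key=silhouettes.get)
labels, centroids, _ = kmeans(teams, best_k)
db_index = davies_bouldin(teams, labels, centroids)

for k in sorted(inertias):
    print(f"k={k}: inertia={inertias[k]:.2f}")
print(f"chosen k={best_k}, silhouette={silhouettes[best_k]:.2f}, "
      f"Davies-Bouldin={db_index:.2f}")
```

On this toy data the inertia drops sharply up to k = 4 and flattens afterwards, which is the elbow pattern the abstract relies on. Real match statistics are far less cleanly separated, which is consistent with the moderate scores the paper reports.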
License
Copyright (c) 2024 Rachmat Bintang Yudhianto, Fajar Athallah Yusuf, Anwar Fitrianto, L.M. Risman Dwi Jumansyah
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication, with the work simultaneously licensed under a Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).
Jurnal Informatika Universitas Pamulang has adopted CC-BY-NC or an equivalent license as the optimal license for the publication, distribution, use, and reuse of scholarly work.
In developing strategy and setting priorities, Jurnal Informatika Universitas Pamulang recognizes that free access is better than priced access, libre access is better than free access, and libre under CC-BY-NC or the equivalent is better than libre under more restrictive open licenses. We should achieve what we can when we can. We should not delay achieving free in order to achieve libre, and we should not stop with free when we can achieve libre.