Systematic Literature Review: Trends in the Development of Models and Algorithms for Dense-Crowd Video Analysis
Keywords:
video analysis, dense crowd, deep learning, tracking, systematic literature review
Abstract
Dense-crowd video analysis is a branch of computer vision with important applications in public safety, emergency management, urban planning, pedestrian traffic engineering, and crowd management at large events such as religious gatherings, music concerts, and sports matches. This study presents a Systematic Literature Review (SLR) of 30 scientific publications published between 2010 and 2025. The review aims to identify recent research trends, the classes of algorithms used, application domains, and the main challenges that remain in crowd video analysis. The results show that deep learning approaches, such as Convolutional Neural Networks (CNNs), Long Short-Term Memory (LSTM) networks, and Transformers, dominate across applications, especially anomaly detection, which aims to recognize suspicious behavior in dense crowds. This technology has significant potential for preventing dangerous events such as riots, mass panic, and accidents. The review also finds a trend toward integrating newer technologies: hybrid algorithms that combine several approaches, federated learning for distributed model training, and multimodal data and drones to improve monitoring effectiveness. However, many challenges remain, including a shortage of representative datasets, reduced accuracy under extreme conditions, computational constraints in real-time applications, and issues of privacy and model interpretability. The results of this SLR are therefore expected to contribute strategically to the development of more sophisticated, adaptive, and relevant crowd analytics systems.
License
Copyright (c) 2025 Doni Fristiyanto, Abu Khalid Rivai

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
