Advances in deep learning (DL) have dramatically changed the video surveillance landscape, making more accurate, efficient, and smart surveillance systems possible. These technologies are increasingly used in public security, traffic management, access control, and anomaly detection (AD). Yet the heterogeneity of models, tasks, and application domains has fragmented the research community. Smart surveillance systems play a crucial role in enhancing public safety through real-time (RT) monitoring and automated threat detection. They reduce dependence on human operators, improve response times, and enable efficient surveillance at large scale. These systems are vital in applications including traffic control, crime prevention, and pandemic management. As artificial intelligence (AI) advances, smart surveillance becomes increasingly adaptive, scalable, and essential to modern security infrastructure. The objective of this review is to present an organized and extensive overview of DL methods in video surveillance, with emphasis on five major areas: object detection (OD) and tracking, human action and activity recognition, face recognition (FR) and person re-identification, AD, and scene understanding using semantic segmentation. The review also presents a comparative study of publicly available datasets used for benchmarking surveillance research. By identifying dominant trends, methodologies, and performance benchmarks, this paper highlights areas of ongoing technological advancement and open research gaps. In addition, the review highlights the growing need for interdisciplinary cooperation and for responsible and ethical AI practices in surveillance technologies.
Garg et al. studied this question.