The rapid proliferation of CCTV cameras across public and private spaces has made continuous manual monitoring inefficient, labour-intensive, and prone to operator fatigue. This work presents a real-time CCTV anomaly detection framework that automatically classifies surveillance video as Normal or Anomalous using a hybrid spatio-temporal deep learning pipeline. The proposed model couples a ResNet-based convolutional encoder for per-frame spatial feature extraction with a Bidirectional Long Short-Term Memory (Bi-LSTM) network that models the frame sequence in both the forward and backward directions. It operates on sequences of 20 uniformly sampled frames at a resolution of 224×224. The pipeline is trained on the UCF-Crime Anomaly Detection Dataset (as distributed on Kaggle), which comprises 13 anomaly categories alongside normal activity videos. The trained model is deployed as a full-stack web application: a Flask REST API backend (port 5000) serves the inference engine and exposes endpoints for both video-file upload and live webcam ingestion, while a Vue.js 3 frontend (Vite, port 5173) provides a dual-mode user interface with drag-and-drop upload, WebRTC live capture, real-time backend health monitoring, and colour-coded result rendering. The system runs without specialised GPU hardware and without cloud services, making it suitable for low-cost local deployment. Experimental validation on the UCF-Crime test set indicates reliable binary classification with responsive inference, and the modular two-tier backend (app.py for routing, inference.py for model logic) allows the model to be upgraded without changes to the routing layer. Compared with existing benchmark approaches such as the MIL framework of Sultani et al. and frame-only CNN detectors, the proposed system combines spatio-temporal modelling, dual-mode input, and an accessible web interface in a single deployable package.
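As an illustration of the uniform sampling step mentioned above, the following is a minimal sketch of how 20 evenly spaced frame indices might be chosen from a clip of arbitrary length. The function name `uniform_frame_indices` and the handling of clips shorter than 20 frames (indices repeat) are assumptions made for this example; the exact sampling rule used in the system is not specified here.

```python
from typing import List


def uniform_frame_indices(total_frames: int, num_samples: int = 20) -> List[int]:
    """Return num_samples frame indices spread evenly over a clip.

    Hypothetical helper: indices run from 0 to total_frames - 1 inclusive.
    If the clip has fewer frames than num_samples, some indices repeat.
    """
    if total_frames <= 0:
        raise ValueError("total_frames must be positive")
    if num_samples <= 0:
        raise ValueError("num_samples must be positive")
    if num_samples == 1:
        return [0]

    # Spacing between consecutive samples, measured in frames.
    step = (total_frames - 1) / (num_samples - 1)

    # Round to the nearest frame and clamp to the last valid index.
    return [min(round(i * step), total_frames - 1) for i in range(num_samples)]
```

The frames selected by these indices would then be resized to 224×224 and passed, in order, to the spatial encoder and the Bi-LSTM.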
Mamatha et al. (Mon,) studied this question.