• A new video anomaly detection model via spatio-temporal relationships among objects.
• An attention module makes the model focus on the spatio-temporal relationships.
• A dynamic pattern generator is designed to memorize spatio-temporal relationships.
• Extensive experiments show the effectiveness and advantage of the proposed method.

Video anomaly detection aims to automatically identify predefined anomalous content (e.g., abnormal objects, behaviors, and scenes) in videos. Its performance can be effectively improved by making the model focus more on the anomalous objects in videos. However, existing approaches of this kind usually rely on pre-trained models, which not only require additional auxiliary information but also face the challenge of anomaly diversity in the real world. In this paper, we propose a new video anomaly detection method based on spatio-temporal relationships among objects. Concretely, we use a fully convolutional encoder-decoder network with symmetric skip connections as the backbone network, which can effectively extract features from the object regions at different scales. In the encoding stage, an attention mechanism is used to enhance the model's understanding of the spatio-temporal relationships among various types of objects in the video. In the decoding stage, a dynamic pattern generator is designed to memorize the inter-object spatio-temporal relationships, which enhances the reconstruction of normal samples while making the reconstruction of abnormal samples more difficult. We conduct extensive experiments on three widely used video anomaly detection datasets (CUHK Avenue, ShanghaiTech Campus, and UCSD Ped2), and the experimental results show that our proposed method significantly improves performance and achieves state-of-the-art overall performance (considering both effectiveness and efficiency). In particular, our method achieves a state-of-the-art AUC of 98.4% on the UCSD Ped2 dataset, which contains various anomalies in real-world scenarios.
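The AUC reported above is a ranking metric over per-frame anomaly scores. As a rough illustration only, the sketch below assumes that each frame's anomaly score is its reconstruction error, which is common for reconstruction-based detectors but is not spelled out in the abstract. The function names `frame_mse` and `roc_auc` and the toy scores are hypothetical and do not come from the paper.

```python
from typing import Sequence


def frame_mse(frame: Sequence[float], reconstruction: Sequence[float]) -> float:
    """Mean squared error between a flattened frame and its reconstruction.

    Assumption: a higher error means the frame is more likely anomalous.
    """
    if len(frame) != len(reconstruction):
        raise ValueError("frame and reconstruction must have the same length")
    return sum((a - b) ** 2 for a, b in zip(frame, reconstruction)) / len(frame)


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Frame-level ROC AUC.

    Computed as the probability that a randomly chosen anomalous frame
    (label 1) scores higher than a randomly chosen normal frame (label 0),
    with ties counted as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one normal and one anomalous frame")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy anomaly scores (hypothetical) and ground-truth frame labels.
    toy_scores = [0.1, 0.4, 0.35, 0.8]
    toy_labels = [0, 0, 1, 1]
    print(roc_auc(toy_scores, toy_labels))
```

The pairwise formulation is quadratic in the number of frames, which is fine for a small illustration; a rank-based implementation would be preferable for full datasets.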
Wang et al. studied this question.