Anomaly detection (AD), the task of detecting anomalies in images or videos, is commonly treated as a one-class classification problem (i.e., the model is trained only on normal data and must identify abnormal data at inference time). A prominent category of existing work is the reconstruction-based method, in which a model is trained to reconstruct its input and the reconstruction error with respect to the target serves as the abnormality score. However, because they do not consider global information, these methods may fail when the reconstruction model generalizes well enough to reconstruct anomalous inputs accurately as well. To tackle this problem, we propose a proxy task of feature mimicking that can be integrated into a wide range of anomaly detection frameworks and exploits their inherently discriminative hidden-layer features. Moreover, we present a novel attention module that takes as input the feature inconsistency matrix generated by the feature-mimicking task. This feature-inconsistency-guided attention module enables the reconstruction-based model to focus on the regions or patterns where the global, semantic feature inconsistency is higher. We integrate our method into several state-of-the-art methods for anomaly detection on images and videos. The empirical results show that our method improves these baselines and achieves new state-of-the-art performance on MVTec AD, CUHK Avenue, and ShanghaiTech.
Zheng et al. (Thu,) studied this question.
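To make the idea of inconsistency-guided scoring concrete, the following is a minimal sketch, not the authors' implementation. It assumes that the feature inconsistency is the per-position absolute gap between target and mimicked features, that attention is a softmax over all positions, and that the anomaly score is the attention-weighted squared reconstruction error. All function names are hypothetical, and the abstract does not specify any of these choices.

```python
import math

def squared_error_map(x, x_hat):
    """Per-position squared reconstruction error for two equally sized 2D grids."""
    return [[(a - b) ** 2 for a, b in zip(row, row_hat)]
            for row, row_hat in zip(x, x_hat)]

def inconsistency_map(f_target, f_mimic):
    """Assumed inconsistency: absolute gap between target and mimicked features."""
    return [[abs(a - b) for a, b in zip(row_t, row_m)]
            for row_t, row_m in zip(f_target, f_mimic)]

def softmax_attention(grid):
    """Assumed attention: softmax over all positions, so the weights sum to 1."""
    flat = [v for row in grid for v in row]
    peak = max(flat)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in flat]
    total = sum(exps)
    width = len(grid[0])
    weights = [e / total for e in exps]
    return [weights[i:i + width] for i in range(0, len(weights), width)]

def anomaly_score(x, x_hat, f_target, f_mimic):
    """Attention-weighted reconstruction error (hypothetical scoring rule)."""
    err = squared_error_map(x, x_hat)
    att = softmax_attention(inconsistency_map(f_target, f_mimic))
    return sum(e * a
               for err_row, att_row in zip(err, att)
               for e, a in zip(err_row, att_row))
```

Under these assumptions, a position with a large feature inconsistency contributes more to the score than the same reconstruction error would under uniform weighting, which reflects the stated goal of focusing the model on regions where the semantic inconsistency is higher.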