Deep anomaly detection aims to provide robust and efficient classifiers for zero-shot (unsupervised, UNS) and few-shot (imbalanced supervised, IMS) settings. However, current models still struggle on edge-case normal samples and often fail to maintain high performance across anomalies of different scales. Moreover, no unified framework efficiently addresses both the UNS and IMS settings. To address these limitations, we present a novel two-stage method that leverages multi-scale normal prototypes during training to compute an anomaly deviation score. First, we employ a novel memory-augmented contrastive learning scheme to jointly learn representations and memory modules across multiple scales. This allows us to capture subtle features of normal data while adapting to varying levels of anomaly complexity. Then, we train an efficient distance-based anomaly detector that computes spatial deviation maps between the learned prototypes and incoming observations. Our model outperforms the SoTA on a wide range of anomalies, including object, style, and local anomalies, as well as industrial inspection and face anti-spoofing, while being on par with SoTA out-of-distribution detectors. Notably, it stands as the first model capable of maintaining exceptional performance across both settings.
Jezequel et al. (Thu,) studied this question.
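The second stage described in the abstract, computing spatial deviation maps between learned prototypes and incoming observations, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the distance, the prototype format, or how maps are aggregated into a score, so the Euclidean distance, the nearest-prototype rule, the peak-then-average aggregation, and the names `deviation_map` and `anomaly_score` are all assumptions made for illustration.

```python
import math


def deviation_map(features, prototypes):
    """Return an H x W map of distances to the nearest prototype.

    `features` is an H x W grid of feature vectors (lists of floats) and
    `prototypes` is a list of vectors of the same dimension. Euclidean
    distance and the nearest-prototype rule are assumptions.
    """
    return [
        [min(math.dist(vec, proto) for proto in prototypes) for vec in row]
        for row in features
    ]


def anomaly_score(features_per_scale, prototypes_per_scale):
    """Combine per-scale deviation maps into one scalar score.

    Takes the largest deviation in each scale's map and averages these
    peaks over scales. This aggregation is an assumed choice.
    """
    peaks = []
    for features, prototypes in zip(features_per_scale, prototypes_per_scale):
        dmap = deviation_map(features, prototypes)
        peaks.append(max(max(row) for row in dmap))
    return sum(peaks) / len(peaks)


# Toy data: two "normal" prototypes and a 2 x 2 grid of 2-D features.
prototypes = [[0.0, 0.0], [1.0, 1.0]]
normal = [[[0.0, 0.0], [1.0, 1.0]], [[0.0, 0.0], [1.0, 1.0]]]
anomalous = [[[0.0, 0.0], [1.0, 1.0]], [[4.0, 5.0], [1.0, 1.0]]]

normal_score = anomaly_score([normal], [prototypes])
anomalous_score = anomaly_score([anomalous], [prototypes])
```

In this toy case every location of `normal` coincides with a prototype, so its score is zero, while the single off-prototype location in `anomalous` dominates its map. A real system would use learned feature extractors and memory modules at several scales in place of the hand-written vectors.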