Anomaly detection with limited data is a pressing challenge across numerous applied domains, including medical diagnostics. Machine learning methods typically rely on annotated anomalous samples for training, which are often impractical to obtain. Existing anomaly detection techniques designed for few-shot or zero-shot scenarios suffer from various limitations. In particular, the common assumption of normally distributed data reduces the accuracy of anomaly classification. This study addresses the task of improving the accuracy and completeness of anomaly detection in previously unseen images by combining the Contrastive Language-Image Pretraining (CLIP) model with the domain-specific transformer BERT Pre-Training of Image Transformers (BeiT). Integrating the CLIP and BeiT models enables simultaneous binary segmentation and anomaly classification. Anomaly detection is enhanced through the use of weighted embeddings from each module. Additionally, the automated generation of textual representations by a Large Language Model significantly improves the generalization capacity of the system. The performance of the proposed models was evaluated on the Benchmarks for Medical Anomaly Detection test set. For the dermatological domain, a test set was constructed from ISIC-18, ISIC-19, SD-198, and the 7-point criteria database. Compared to existing state-of-the-art solutions, the proposed method demonstrated an average improvement in the ROC-AUC metric of 10.95 % at the image level and 0.66 % at the pixel level. Experimental results confirm the effectiveness of the proposed approach in anomaly classification and segmentation tasks, with superior average metric values. Inference analysis revealed that incorporating a variational autoencoder within the CLIP+BeiT architecture for centroid generation enhances model stability in few-shot scenarios.
The practical significance of the proposed method lies in its adaptability and robustness to changing data distributions, making it a promising solution for automated anomaly analysis in medical diagnostics, industrial monitoring, and other domains characterized by high data uncertainty.
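The abstract does not specify how the embeddings are weighted or scored, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that both encoders' embeddings have already been projected to a shared dimension, that fusion is a simple convex combination, and that an image is scored by a softmax over cosine similarities to a "normal" and an "anomalous" reference vector (for example, text-prompt embeddings or centroids). All function names, the `weight_a` parameter, and the `temperature` value are hypothetical. A small ROC-AUC helper is included because the same metric applies at the image level (one score per image) and at the pixel level (flattened score maps and masks).

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products act as cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def fuse_embeddings(emb_a, emb_b, weight_a=0.5):
    """Convex combination of two encoders' embeddings (assumed same dimension).

    weight_a is a hypothetical mixing weight; the source does not state
    how the weights are chosen.
    """
    fused = weight_a * l2_normalize(emb_a) + (1.0 - weight_a) * l2_normalize(emb_b)
    return l2_normalize(fused)


def anomaly_scores(image_emb, normal_ref, anomalous_ref, temperature=0.07):
    """Return P(anomalous) from a softmax over similarities to two reference vectors."""
    sims = np.stack([image_emb @ normal_ref, image_emb @ anomalous_ref], axis=-1)
    sims = sims / temperature
    sims = sims - sims.max(axis=-1, keepdims=True)  # numerical stability
    probs = np.exp(sims)
    probs = probs / probs.sum(axis=-1, keepdims=True)
    return probs[..., 1]


def roc_auc(labels, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half. Inputs are flattened, so the same function serves
    image-level scores and pixel-level score maps. Quadratic in sample count,
    which is acceptable for a sketch but not for large pixel maps.
    """
    labels = np.asarray(labels).astype(bool).ravel()
    scores = np.asarray(scores, dtype=float).ravel()
    pos, neg = scores[labels], scores[~labels]
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / (pos.size * neg.size))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dim = 8  # stand-in for a shared projection dimension (assumption)
    normal_ref = l2_normalize(rng.normal(size=dim))
    anomalous_ref = l2_normalize(rng.normal(size=dim))
    clip_like = rng.normal(size=(4, dim))  # synthetic stand-ins, not real CLIP outputs
    beit_like = rng.normal(size=(4, dim))  # synthetic stand-ins, not real BeiT outputs
    fused = fuse_embeddings(clip_like, beit_like, weight_a=0.6)
    scores = anomaly_scores(fused, normal_ref, anomalous_ref)
    print("image-level scores:", scores)
    print("image-level ROC-AUC:", roc_auc([0, 0, 1, 1], scores))
```

In this sketch the mixing weight is a single scalar; a learned or per-domain weight would fit the same interface. For pixel-level evaluation, the score map and ground-truth mask can be passed to `roc_auc` directly, since both are flattened internally.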
Sergey A. Milantev
P. D. Mikhailova
I.A. Bessmertny
Scientific and Technical Journal of Information Technologies, Mechanics and Optics
ITMO University
Saint Petersburg State Electrotechnical University
Milantev et al. (Fri,) studied this question.
www.synapsesocial.com/papers/68c183f09b7b07f3a060f75b — DOI: https://doi.org/10.17586/2226-1494-2025-25-4-684-693