September 10, 2025 · Open Access

Prototype-Based Two-Stage Few-Shot Instance Segmentation with Flexible Novel Class Adaptation

Key Points

This framework significantly reduces memory consumption by using embedding vectors instead of images.
The model achieves superior performance on benchmark evaluations, specifically on the COCO dataset.
A two-stage training paradigm effectively facilitates the adaptation of novel classes without requiring extra training.
A Region of Interest (RoI)-level matching mechanism allows novel classes to be added flexibly at inference time.

Abstract

Few-shot instance segmentation (FSIS) addresses instance segmentation when labeled data for novel classes is scarce. Existing methods, however, are limited in two respects: novel classes cannot be added flexibly, and class exemplars must be retained during both training and inference, which consumes considerable memory. To overcome these limitations, this study introduces a framework built on a two-stage “base training-novel class fine-tuning” paradigm that learns discriminative instance-level embeddings. Instance embeddings are aggregated into class prototypes, and storing embedding vectors rather than images reduces the memory overhead. A Region of Interest (RoI)-level cosine similarity matching mechanism then allows novel classes to be added without additional training and without access to historical data. Experiments show that the approach significantly outperforms state-of-the-art methods on mainstream benchmarks. More importantly, its memory efficiency enables, for the first time, joint evaluation of FSIS performance across all classes in the COCO dataset. Visualized examples (colored masks and class labels for objects in diverse scenes) further support the method's effectiveness in complex real-world settings.
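The prototype and matching idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`build_prototype`, `classify_roi`), the mean-of-normalized-embeddings aggregation, the 3-dimensional toy vectors, and the 0.5 rejection threshold are all assumptions made for illustration. The abstract states only that instance embeddings are aggregated into class prototypes and matched to RoIs by cosine similarity.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def build_prototype(embeddings):
    """Aggregate instance embeddings into one class prototype.

    Assumption: the prototype is the normalized mean of the normalized
    embeddings; the paper may aggregate differently.
    """
    normalized = [l2_normalize(e) for e in embeddings]
    dim = len(normalized[0])
    mean = [sum(e[i] for e in normalized) / len(normalized) for i in range(dim)]
    return l2_normalize(mean)


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def classify_roi(roi_embedding, prototypes, threshold=0.5):
    """Match one RoI embedding against every stored prototype.

    Returns (class_name, score), or (None, score) when the best score is
    below the (hypothetical) threshold.
    """
    best_class, best_score = None, -1.0
    for name, proto in prototypes.items():
        score = cosine_similarity(roi_embedding, proto)
        if score > best_score:
            best_class, best_score = name, score
    if best_score < threshold:
        return None, best_score
    return best_class, best_score


# Only embedding vectors are stored, never the support images themselves.
prototypes = {"cat": build_prototype([[1.0, 0.1, 0.0], [0.9, 0.0, 0.1]])}

# Adding a novel class is a dictionary insert; no retraining step is run.
prototypes["bicycle"] = build_prototype([[0.0, 1.0, 0.2], [0.1, 0.9, 0.0]])

label, score = classify_roi([0.05, 1.0, 0.1], prototypes)
unknown_label, unknown_score = classify_roi([0.0, 0.0, 1.0], prototypes)
```

In this sketch, registering a class touches only the prototype store, which is one way to read the abstract's claim that novel classes can be added without supplementary training or historical data.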
