Abstract Few-shot Segmentation (FSS) aims to segment objects in images given only a few labeled samples, such as one or five. Most current FSS methods are based on prototype representations. Prototype-based methods are simple yet effective, but they overlook a crucial problem: intra-class variation. To address this problem, we propose an Adaptive Prototype Aggregation Network (APANet), which adaptively alleviates the intra-class discrepancy in FSS by enhancing the prototype representation. Specifically, we design a cross-attention module for visual-text alignment that associates the support pseudo mask with the text embedding, and we further generate text-visual aligned prototypes by making full use of class-specific support and query information. Experimental results demonstrate that the proposed APANet achieves superior performance on both the PASCAL-5^i and COCO-20^i datasets, surpassing state-of-the-art FSS models.
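For readers unfamiliar with the prototype representations the abstract refers to, the following is a minimal sketch of a generic prototype-based FSS baseline. It is not APANet and does not include its cross-attention module or text embeddings. It assumes features are already extracted as small nested lists, and the function names (`masked_average_pooling`, `cosine`, `segment_query`) are hypothetical, chosen for illustration only. A class prototype is the mean of the support features under the mask, and each query location is labeled by the prototype it is most similar to.

```python
import math


def masked_average_pooling(features, mask):
    """Average the feature vectors at locations where mask == 1.

    features: H x W x C nested lists, mask: H x W nested lists of 0/1.
    Returns a C-dimensional prototype (illustrative helper, not from the paper).
    """
    channels = len(features[0][0])
    total = [0.0] * channels
    count = 0
    for feat_row, mask_row in zip(features, mask):
        for vec, m in zip(feat_row, mask_row):
            if m:
                count += 1
                for c in range(channels):
                    total[c] += vec[c]
    if count == 0:
        raise ValueError("mask selects no locations")
    return [t / count for t in total]


def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def segment_query(query_features, fg_proto, bg_proto):
    """Label a query location 1 if it is closer to the foreground prototype."""
    return [
        [1 if cosine(vec, fg_proto) > cosine(vec, bg_proto) else 0 for vec in row]
        for row in query_features
    ]


# Toy 2 x 2 support image with 2-channel features (made-up values).
support = [[[1.0, 0.0], [0.9, 0.1]],
           [[0.0, 1.0], [0.1, 0.9]]]
support_mask = [[1, 1],
                [0, 0]]
inverse_mask = [[1 - m for m in row] for row in support_mask]

fg_proto = masked_average_pooling(support, support_mask)
bg_proto = masked_average_pooling(support, inverse_mask)

query = [[[0.8, 0.2], [0.2, 0.8]],
         [[0.1, 0.7], [0.9, 0.3]]]
pred = segment_query(query, fg_proto, bg_proto)
```

Because the prototype is a single average, support objects that look different from the query object of the same class are collapsed into one vector. This is the intra-class variation problem the abstract describes.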
Li et al. (Thu,) studied this question.