Prompt learning has emerged as an effective strategy for adapting vision-language models (VLMs): it injects learnable semantic prompts into a VLM to guide the alignment between visual and textual representations. Although existing methods perform strongly across various tasks, they usually focus on representative, class-level samples and overlook atypical and hard samples in the visual feature space, which hinders the generalization of VLMs. To address this issue, we propose the concept of a dynamic boundary prototype, which highlights ambiguous samples that lie far from the class centroid and is updated at each epoch. Accordingly, we propose a Distribution-Aware Prompt Learning (DAPL) framework that calibrates the distribution of the visual feature space through the definition, optimization, and updating of dynamic boundary prototypes. First, we introduce Boundary-Centroid Pulling, which optimizes the intra-class distribution by progressively reducing the distance between the boundary and centroid prototypes, thereby enhancing structural consistency within each class. Second, to further enhance inter-class separability, we design a distance-weighted contrastive loss that places greater emphasis on distinguishing adjacent classes, facilitating more effective fine-grained discrimination. Third, we apply Low-Rank Adaptation (LoRA) fine-tuning to adapt the vision encoder through targeted modifications to its self-attention layers. Additionally, we adopt a progressive training strategy for stable optimization. DAPL is compatible with mainstream prompt learning methods such as CoOp, CoCoOp, and PromptKD, and consistently improves their average performance across 11 benchmark datasets.
Yang et al. studied this question.
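The abstract does not give formulas for the boundary prototype, the pulling objective, or the distance weighting, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that a class's boundary prototype is the mean of the `k` features farthest from the class centroid, that Boundary-Centroid Pulling penalizes the squared Euclidean distance between the two prototypes, and that adjacent classes are emphasized through a softmax over negative centroid distances. All function names and the parameters `k` and `tau` are hypothetical.

```python
import math


def centroid(feats):
    """Mean of a list of equal-length feature vectors."""
    dim = len(feats[0])
    return [sum(f[d] for f in feats) / len(feats) for d in range(dim)]


def dist(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def boundary_prototype(feats, k):
    """Assumed definition: mean of the k features farthest from the centroid."""
    c = centroid(feats)
    farthest = sorted(feats, key=lambda f: dist(f, c), reverse=True)[:k]
    return centroid(farthest)


def pulling_loss(feats, k):
    """Assumed form: squared distance between boundary and centroid prototypes."""
    c = centroid(feats)
    b = boundary_prototype(feats, k)
    return dist(b, c) ** 2


def neighbor_weights(centroids, i, tau=1.0):
    """Assumed weighting: softmax over negative centroid distances,
    so classes closer to class i receive larger weights."""
    scores = {
        j: math.exp(-dist(centroids[i], c) / tau)
        for j, c in enumerate(centroids)
        if j != i
    }
    total = sum(scores.values())
    return {j: s / total for j, s in scores.items()}


# Toy class with one atypical sample far from the rest.
feats = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
print(boundary_prototype(feats, k=1))
print(pulling_loss(feats, k=1))

# Class 1 is much closer to class 0 than class 2 is, so it gets more weight.
print(neighbor_weights([[0.0, 0.0], [1.0, 0.0], [10.0, 0.0]], i=0))
```

In a training loop, the prototypes would be recomputed from the current visual features once per epoch, which is what would make them "dynamic" in the sense the abstract describes. The weights could then scale the negative terms of a contrastive loss, although the exact form used in DAPL is not specified here.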