Large-scale vision-language models (VLMs) have shown strong zero-shot generalization on unseen-domain data. However, when pre-trained VLMs are adapted to a sequence of downstream tasks, they are prone to forgetting previously learned knowledge and to losing their zero-shot classification capability. To tackle this problem, we propose a Selective Dual-Teacher Knowledge Transfer framework that uses the most recent fine-tuned VLM and the original pre-trained VLM as dual teachers to preserve previously learned knowledge and zero-shot capabilities, respectively. With access only to an unlabeled reference dataset, the framework performs selective knowledge distillation by measuring the feature discrepancy between the two teacher VLMs. Consequently, selective dual-teacher knowledge distillation mitigates catastrophic forgetting of previously learned knowledge while preserving the zero-shot capabilities of the pre-trained VLM. Through extensive experiments on benchmark datasets, we show that the proposed framework compares favorably against state-of-the-art continual learning approaches in preventing catastrophic forgetting and zero-shot degradation.
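The selective distillation idea can be illustrated with a minimal sketch. The abstract does not say how the feature discrepancy is turned into a selection, so everything below is an assumption made for illustration: the function name `selective_dual_teacher_loss`, the squared-distance discrepancy on normalized features, the exponential weighting, and the `temperature` parameter are all hypothetical and are not taken from the paper. The sketch assumes that a large discrepancy between the two teachers on a reference sample suggests the fine-tuned teacher holds previously learned knowledge about it, so the student is pulled toward that teacher; otherwise it is pulled toward the pre-trained teacher.

```python
import math


def l2_normalize(v):
    """Scale a feature vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def selective_dual_teacher_loss(student, finetuned, pretrained, temperature=0.1):
    """Hypothetical selective distillation loss over a batch of feature vectors.

    student, finetuned, pretrained: lists of feature vectors computed on the
    same unlabeled reference samples by the student, the most recent
    fine-tuned teacher, and the original pre-trained teacher.
    Returns (mean loss, per-sample weights toward the fine-tuned teacher).
    """
    if not student:
        raise ValueError("empty batch")
    total = 0.0
    weights = []
    for s, f, p in zip(student, finetuned, pretrained):
        s, f, p = l2_normalize(s), l2_normalize(f), l2_normalize(p)
        # Feature discrepancy between the two teachers on this sample.
        discrepancy = sq_dist(f, p)
        # Assumed mapping: weight in [0, 1) that grows with the discrepancy.
        w = 1.0 - math.exp(-discrepancy / temperature)
        weights.append(w)
        # Blend the two distillation targets according to the weight.
        total += w * sq_dist(s, f) + (1.0 - w) * sq_dist(s, p)
    return total / len(weights), weights
```

When the two teachers agree on a sample, the weight is zero and the student is distilled only from the pre-trained teacher; as they diverge, the fine-tuned teacher dominates. A soft weight is used here instead of a hard choice so the loss varies smoothly with the discrepancy, which is a design choice of this sketch only.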
Yu-Chu Yu
Chi-Pin Huang
J.C. Chen
Yu et al. (Thu,) studied this question.
www.synapsesocial.com/papers/68e7420ab6db6435876bb759 — DOI: https://doi.org/10.48550/arxiv.2403.09296