April 26, 2024 | Open Access

Parameter Efficient Fine-tuning of Self-supervised ViTs without Catastrophic Forgetting

Key Points

Block Expansion causes only a minimal performance drop in the pre-training domain, effectively mitigating catastrophic forgetting in pre-trained vision transformers.
Both Block Expansion and low-rank adaptation (LoRA) are more parameter-efficient than full fine-tuning of vision transformers on new tasks.
A model pre-trained on ImageNet-1k loses over 70% accuracy on ImageNet-1k after fine-tuning on CIFAR-100, which shows how severe catastrophic forgetting is in vision transformers. The parameter-efficient strategies studied in this work address that loss while keeping the model efficient. Notably, self-supervised pre-trained vision transformers adapted with these strategies still perform well on the new tasks.

Abstract

Artificial neural networks often suffer from catastrophic forgetting, where learning new concepts leads to a complete loss of previously acquired knowledge. We observe that this issue is particularly magnified in vision transformers (ViTs), where post-pre-training and fine-tuning on new tasks can significantly degrade the model's original general abilities. For instance, a DINO ViT-Base/16 pre-trained on ImageNet-1k loses over 70% accuracy on ImageNet-1k after just 10 iterations of fine-tuning on CIFAR-100. Overcoming this stability-plasticity dilemma is crucial for enabling ViTs to continuously learn and adapt to new domains while preserving their initial knowledge. In this work, we study two new parameter-efficient fine-tuning strategies: (1) Block Expansion, and (2) low-rank adaptation (LoRA). Our experiments reveal that using either Block Expansion or LoRA on self-supervised pre-trained ViTs surpasses fully fine-tuned ViTs in new domains while offering significantly greater parameter efficiency. Notably, we find that Block Expansion experiences only a minimal performance drop in the pre-training domain, thereby effectively mitigating catastrophic forgetting in pre-trained ViTs.
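
To make the parameter-efficiency claim concrete, the following is a minimal sketch of a LoRA-style linear layer in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the function names (`matmul`, `lora_param_counts`, `lora_forward`), the rank of 8, and the scaling factor are hypothetical choices, and the abstract does not say which ViT layers are adapted. The one number not chosen for illustration is 768, the hidden width of a ViT-Base model.

```python
# Illustrative sketch only: names, rank, and scale are assumptions,
# not taken from the paper.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def lora_param_counts(d_in, d_out, rank):
    """Return (weights in the full matrix, weights in the two LoRA factors)."""
    full = d_in * d_out
    lora = rank * (d_in + d_out)  # A is d_in x rank, B is rank x d_out
    return full, lora

def lora_forward(x, W, A, B, scale):
    """Compute x W + scale * (x A) B, with W frozen and only A, B trainable."""
    base = matmul(x, W)
    delta = matmul(matmul(x, A), B)
    return [[b + scale * d for b, d in zip(b_row, d_row)]
            for b_row, d_row in zip(base, delta)]

# One 768 x 768 projection (ViT-Base width) with an assumed rank of 8.
full, lora = lora_param_counts(768, 768, rank=8)
print(full, lora)  # 589824 12288
```

In this sketch the low-rank factors hold 12,288 weights against 589,824 for the full matrix, about 2% of the layer. If `B` starts at zero, as is common for LoRA, `lora_forward` returns exactly the frozen layer's output before any training step, so the adapted model begins from the pre-trained behavior.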

Authors

Reza Akbarian Bafghi

University of Colorado Boulder

Nidhin Harilal

University of Colorado Boulder

Claire Monteleoni

Institut national de recherche en sciences et technologies du numérique
