In this paper, we first propose MoE-Adapters, a parameter-efficient training framework that alleviates long-term forgetting in incremental learning with Vision-Language Models (VLMs). MoE-Adapters leverage incrementally added routers to activate and integrate exclusive expert adapters from a pre-defined static expert set, enabling the pre-trained CLIP to adapt efficiently to new tasks. To preserve the zero-shot capability of the VLM, we introduce a Distribution Discriminative Auto-Selector (DDAS) that automatically routes in-distribution and out-of-distribution inputs to the MoE-Adapters and the original CLIP, respectively. However, relying on a static expert set and a separate distribution selector can lead to parameter redundancy and increased training complexity. We therefore extend the framework to MoE-Adapters++ by introducing dynamic MoE-Adapters, which allow experts to be involved adaptively during the continual learning process. Additionally, we propose a Latent Embedding Auto-Selector (LEAS) that incorporates distribution selection within CLIP, yielding a more unified architecture. Extensive experiments across diverse settings demonstrate that the proposed method consistently surpasses previous state-of-the-art approaches while improving training efficiency.
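The routing idea described above can be sketched in a few lines: a task router scores a set of expert adapters, the top-k experts are activated, and their outputs are mixed by softmax weights. This is only an illustrative sketch, not the paper's implementation; the expert form (a linear map on a stand-in CLIP feature), the dimensions, and the top-k value are all assumptions.

```python
import numpy as np

# Illustrative sketch of mixture-of-experts adapter routing; the actual
# MoE-Adapters architecture, expert design, and hyperparameters differ.
rng = np.random.default_rng(0)
DIM, NUM_EXPERTS, TOP_K = 8, 4, 2

# Each "expert adapter" is sketched as a small linear map on a feature vector.
experts = [rng.standard_normal((DIM, DIM)) * 0.01 for _ in range(NUM_EXPERTS)]
router = rng.standard_normal((DIM, NUM_EXPERTS))  # hypothetical task router

def moe_adapter(x):
    """Route feature x to its top-k experts and mix their outputs."""
    logits = x @ router
    top = np.argsort(logits)[-TOP_K:]            # indices of selected experts
    w = np.exp(logits[top] - logits[top].max())  # stable softmax over top-k
    w /= w.sum()
    return sum(wi * (x @ experts[i]) for wi, i in zip(w, top))

x = rng.standard_normal(DIM)   # stand-in for a CLIP image/text feature
out = moe_adapter(x)
```

A distribution selector in the spirit of DDAS would sit in front of this function, sending out-of-distribution features to the frozen CLIP branch instead.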
Jiazuo Yu
Zichen Huang
Yunzhi Zhuge
IEEE Transactions on Pattern Analysis and Machine Intelligence
Tsinghua University
Dalian University of Technology
University of Electronic Science and Technology of China
www.synapsesocial.com/papers/68a3633d0a429f7973329f0c — DOI: https://doi.org/10.1109/tpami.2025.3597942