Large-scale pre-trained vision models such as ViT, CLIP, and SAM provide strong foundations for diverse vision tasks, motivating recent Mixture-of-Experts (MoE) approaches that combine multiple experts. However, existing methods often rely on static or implicit routing strategies, limiting adaptability to task semantics and input characteristics. We propose a task-adaptive vision expert routing framework based on competency learning guided by predictive uncertainty. We define expert competency as the relative reduction in predictive uncertainty induced by inter-expert interaction, and formulate expert routing as a learning problem driven by this signal. Our method uses task embeddings derived from textual descriptions to guide expert routing, refines expert features through cross-expert interaction, and aggregates them adaptively into a unified representation. By directly optimizing routing and feature composition using an uncertainty-based competency signal, the model learns how expert collaboration improves task-specific prediction reliability. Extensive experiments on diverse vision tasks demonstrate superior generalization performance and adaptive routing behavior aligned with task semantics.
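The competency definition above (relative reduction in predictive uncertainty after inter-expert interaction) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: Shannon entropy stands in for predictive uncertainty, the hypothetical helpers `entropy`, `competency`, and `routing_weights` are invented names, the softmax routing is one plausible choice, and the probability values are toy inputs rather than results from the paper.

```python
import numpy as np

def entropy(probs, eps=1e-12):
    """Shannon entropy (in nats) of each row of a probability matrix."""
    p = np.clip(probs, eps, 1.0)
    return -np.sum(p * np.log(p), axis=-1)

def competency(probs_before, probs_after, eps=1e-12):
    """Relative entropy reduction per expert (assumed uncertainty measure).

    Both inputs have shape (num_experts, num_classes): each expert's
    predictive distribution before and after cross-expert interaction.
    """
    h_before = entropy(probs_before)
    h_after = entropy(probs_after)
    return (h_before - h_after) / (h_before + eps)

def routing_weights(scores, temperature=1.0):
    """Softmax over competency scores (an assumed routing rule)."""
    z = scores / temperature
    z = z - z.max()  # subtract the max for numerical stability
    w = np.exp(z)
    return w / w.sum()

# Toy inputs: expert 0 becomes more confident after interaction,
# expert 1 is unchanged.
before = np.array([[0.5, 0.5], [0.5, 0.5]])
after = np.array([[0.9, 0.1], [0.5, 0.5]])

c = competency(before, after)
w = routing_weights(c)
```

In this toy setting the expert whose uncertainty drops after interaction receives the larger routing weight, which is the behavior the abstract describes at a high level. The actual framework learns routing from task embeddings and this signal rather than computing weights directly.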
Donghyun Han
Yuseok Bae
Jung Uk Kim
ICT Express
Kyung Hee University
Chonnam National University
Electronics and Telecommunications Research Institute
Han et al. studied this question.
www.synapsesocial.com/papers/69e470e9010ef96374d8db16 — DOI: https://doi.org/10.1016/j.icte.2026.04.007