Continual learning for open-vocabulary object detection aims to let pretrained vision–language detectors adapt to diverse specialized domains while preserving their zero-shot generalization. Existing methods, however, focus primarily on mitigating catastrophic forgetting and often neglect the substantial domain shifts encountered in real-world applications. To address this gap, we introduce Open-Domain Continual Object Detection (OD-COD), a new paradigm that requires detectors to adapt continually across domains with significant stylistic gaps. To tackle this challenge, we propose Disentangled Domain Knowledge-Aided Learning (D-Know), a framework that explicitly disentangles domain-general priors from category-specific adaptation and manages them dynamically in a scalable domain knowledge base. Specifically, D-Know first learns domain priors in a self-supervised manner and then leverages these priors to facilitate category-specific adaptation within each domain. To evaluate this task, we construct OD-CODB, the first dedicated benchmark, spanning six domains with substantial visual variations. Extensive experiments show that D-Know surpasses current state-of-the-art methods by an average of 4.2% mAP under open-domain continual settings while maintaining strong zero-shot generalization. Experiments under the few-shot setting further confirm D-Know’s superior data efficiency.
He et al. (Mon,) studied this question.