Existing prompt-based approaches have demonstrated impressive performance in continual learning by leveraging large-scale pre-trained models for classification tasks. However, in incremental medical object detection, the tight coupling of foreground and background information and the coupled attention between prompts and image-text tokens pose significant challenges, owing to the conceptual gap between the medical and natural domains. To overcome these challenges, we introduce a framework comprising two main components: 1) Instance-level Prompt Generation, which decouples fine-grained instance-level knowledge from images and generates prompts that focus on dense predictions, and 2) Decoupled Prompt Attention, which decouples the original prompt attention, enabling a more direct and efficient transfer of prompt information while reducing memory usage and mitigating catastrophic forgetting. We collect 13 clinical, cross-modal, multi-organ, and multi-category datasets, and experiments demonstrate that our framework outperforms existing SOTA methods, with FAP improvements of 5.44\%, 4.83\%, 12.88\%, and 4.59\% in the full-data, 1-shot, 10-shot, and 50-shot settings, respectively.
Yi et al. (Sat,) studied this question.