Test-time few-shot object detection (FSOD) identifies novel categories from a limited number of support examples without fine-tuning the model. Despite recent advances, existing FSOD methods, including our prior work, still struggle with domain/category shift and limited data availability. Building on our previous research on test-time FSOD, this article proposes a novel dynamic prototype fusion network (PFN) to overcome these limitations. To mitigate the impact of distribution shift, we introduce a dynamic prototype refinement method that adaptively updates prototypes from support images. To compensate for the scarcity of samples, we exploit the information within support images as fully as possible. Specifically, we design a dual-level multiscale information integration approach that fuses information across different network layers and image scales, enhancing the model's discriminative capability. Additionally, a mask-based preprocessing technique uses segmentation labels on support samples to suppress the adverse impact of background noise on model accuracy. Notably, to align with the constraints of test-time scenarios, model parameters remain fixed during the configuration step, and only the prototypes are updated each time users input novel support samples. As a result, our method outperforms existing state-of-the-art FSOD methods on multiple benchmarks. The code is available at https://github.com/CatfishW/TIDEV2.
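The prototype-only update described above can be illustrated with a minimal sketch. It is not the PFN implementation: the abstract does not specify how prototypes are computed or refined, so this example assumes masked average pooling over a support feature map and a running mean over support samples. The function names `masked_prototype` and `update_prototype` are hypothetical, and the backbone that would produce the features is treated as frozen and left out.

```python
import numpy as np


def masked_prototype(features, mask):
    """Average a (C, H, W) feature map over the foreground of an (H, W) mask.

    Assumption: masked average pooling stands in for the mask-based
    preprocessing; the source does not state the exact operation.
    """
    mask = mask.astype(features.dtype)
    foreground = mask.sum()
    if foreground == 0:
        # No foreground pixels: fall back to a plain spatial average.
        return features.mean(axis=(1, 2))
    # Broadcasting applies the (H, W) mask to every channel.
    return (features * mask).sum(axis=(1, 2)) / foreground


def update_prototype(prototype, new_prototype, count):
    """Fold one more support prototype into a running mean.

    Assumption: a running mean stands in for the adaptive refinement.
    Only the prototype changes; no model parameter is touched.
    Returns the updated prototype and the new sample count.
    """
    if prototype is None:
        return new_prototype, 1
    updated = (prototype * count + new_prototype) / (count + 1)
    return updated, count + 1


# Toy support sample: 2 channels on a 2x2 grid, diagonal foreground.
features = np.arange(8, dtype=float).reshape(2, 2, 2)
mask = np.array([[1, 0], [0, 1]])

first = masked_prototype(features, mask)
prototype, count = update_prototype(None, first, 0)

# A second support sample of the same category arrives at test time.
second = masked_prototype(features + 2.0, mask)
prototype, count = update_prototype(prototype, second, count)
```

Under these assumptions, adding a support sample costs one pooling pass and one vector update, which is consistent with keeping the model parameters fixed at test time.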
Wu et al. studied this question.