Remote sensing object detection requires precise identification of multi-scale, multi-directional targets in complex backgrounds, demanding a model that achieves both high accuracy and real-time performance. While knowledge distillation has proven effective for compressing natural-image models, it shows limitations in realistic remote sensing scenarios, including inadequate adaptability, biases from long-tail data distributions, and the propagation of errors from the teacher model. To address these challenges, we propose a Prompt Driven Knowledge Distillation (PDKD) framework for remote sensing object detection. The framework uses prompt-based mechanisms to guide the student model in acquiring and assimilating the teacher's knowledge, and integrates three core components: (1) a Scale-Decoupled Feature Prompting (SDFP) module that dynamically adjusts feature representation capabilities through scale decoupling, enabling differentiated distillation for targets of varying scales; (2) a Semantic Visual Co-Prompting (SVCP) module that, based on CLIP's multimodal prior knowledge, constructs category-specific semantic prompt vectors to enhance the focus on features of long-tail categories; and (3) a Self-Correcting Prompting (SCP) module that suppresses error propagation through a cross self-distillation mechanism. Experiments on the DOTA dataset show that, with a 1x training schedule, the model achieves 49.0% mAP. Source code is available at https://github.com/Ningsui/PDKD.git.
Yang et al. studied this question.