What question did this study set out to answer?

The aim is to improve remote sensing object detection using a new knowledge distillation framework that overcomes existing limitations.

March 15, 2026

Prompt Driven Knowledge Distillation for Remote Sensing Object Detection

Key Points

Developed a Prompt Driven Knowledge Distillation framework with three components: SDFP, SVCP, and SCP.
Implemented scale decoupling to tailor feature representation for different target sizes.
Utilized CLIP's prior knowledge to create semantic prompts for improved categorization of long-tail data.
Incorporated a self-correction mechanism to minimize error propagation in the student model.
Achieved a mean Average Precision (mAP) of 49.0% on the DOTA dataset with a 1x training schedule.
Demonstrated effective handling of long-tail data through semantic prompting.
Showed improved real-time performance and accuracy in complex remote sensing environments.

Abstract

Remote sensing object detection requires precise identification of multi-scale and multi-directional targets in complex backgrounds, demanding a model that achieves both high accuracy and real-time performance. While knowledge distillation has proven effective for compressing natural-image models, it shows limitations in more realistic remote sensing scenarios, including inadequate adaptability, biases from long-tail data distributions, and the propagation of errors from the teacher model. To address these challenges, we propose a Prompt Driven Knowledge Distillation (PDKD) framework for remote sensing object detection. The framework uses prompt-based mechanisms to guide the student model in acquiring and assimilating the teacher's knowledge, and integrates three core components: (1) a Scale-Decoupled Feature Prompting (SDFP) module that dynamically adjusts feature representation capabilities through scale decoupling, enabling differentiated distillation for targets of varying scales; (2) a Semantic Visual Co-Prompting (SVCP) module that, based on CLIP's multimodal prior knowledge, constructs category-specific semantic prompt vectors to enhance the focus on features of long-tail categories; and (3) a Self-Correcting Prompting (SCP) module that suppresses error propagation through a cross self-distillation mechanism. Experiments on the DOTA dataset show that, with a 1x training schedule, the model achieves 49.0% mAP. Source code is available at https://github.com/Ningsui/PDKD.git.
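
To make the idea of "differentiated distillation for targets of varying scales" concrete, here is a minimal sketch of a generic scale-binned feature-imitation loss. It is not the SDFP module from the paper: the function names, the per-object feature vectors, the bin weights, and the use of COCO-style area thresholds (32² and 96² pixels) are all assumptions made for illustration.

```python
# Illustrative sketch only: a generic scale-binned feature-imitation loss.
# Every name, threshold, and weight below is hypothetical, not taken from PDKD.

def scale_bin(width, height, small_max=32 * 32, medium_max=96 * 96):
    """Assign a box to a scale bin by pixel area (COCO-style thresholds, assumed)."""
    area = width * height
    if area < small_max:
        return "small"
    if area < medium_max:
        return "medium"
    return "large"


def mse(student_vec, teacher_vec):
    """Mean squared error between two equal-length feature vectors."""
    if len(student_vec) != len(teacher_vec):
        raise ValueError("feature vectors must have the same length")
    squared = [(s - t) ** 2 for s, t in zip(student_vec, teacher_vec)]
    return sum(squared) / len(squared)


def scale_decoupled_loss(objects, bin_weights):
    """Average the imitation error inside each scale bin, then sum the
    bin averages using a separate weight per bin."""
    per_bin = {}
    for obj in objects:
        name = scale_bin(obj["w"], obj["h"])
        per_bin.setdefault(name, []).append(mse(obj["student"], obj["teacher"]))
    total = 0.0
    for name, errors in per_bin.items():
        total += bin_weights[name] * sum(errors) / len(errors)
    return total


# Toy data: one small and one large object with 2-dimensional features.
objects = [
    {"w": 10, "h": 12, "student": [0.0, 1.0], "teacher": [1.0, 1.0]},
    {"w": 200, "h": 150, "student": [2.0, 2.0], "teacher": [2.0, 4.0]},
]
bin_weights = {"small": 2.0, "medium": 1.0, "large": 0.5}
loss = scale_decoupled_loss(objects, bin_weights)
```

Averaging within each bin before weighting keeps a scale with many instances from dominating the loss by count alone, which is one plausible reason to decouple scales in the first place. The released PDKD code linked above is the authoritative reference for how the authors actually implement it.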

Cite This Study

Yang et al. studied this question.

synapsesocial.com/papers/69b64c9ab42794e3e660ddd8 https://doi.org/10.1109/tip.2026.3671649
