What question did this study set out to answer?

To develop a multi-stage knowledge distillation framework that improves model compression in remote sensing.

January 14, 2026 | Open Access

Hierarchical Knowledge Distillation for Efficient Model Compression and Transfer: A Multi-Level Aggregation Approach

Key Points

Aimed to develop a multi-stage knowledge distillation framework that improves model compression in remote sensing.
Introduced Hierarchical Multi-Segment Knowledge Distillation (HIMS_KD) framework.
Applied feature-level alignment and similarity-logit alignment.
Conducted experiments on RSITMD and RSICD benchmark datasets.
HIMS_KD improved retrieval performance in remote sensing tasks.
Enhanced zero-shot classification accuracy.
Reduced deployment costs while retaining strong accuracy.

Abstract

The success of large-scale deep learning models in remote sensing tasks has been transformative, enabling significant advances in image classification, object detection, and image–text retrieval. However, their computational and memory demands pose challenges for deployment in resource-constrained environments. Knowledge distillation (KD) alleviates these issues by transferring knowledge from a strong teacher to a student model, which can be compact for efficient deployment or architecturally matched to improve accuracy under the same inference budget. In this paper, we introduce Hierarchical Multi-Segment Knowledge Distillation (HIMSKD), a multi-stage framework that sequentially distills knowledge from a teacher into multiple assistant models specialized in low-, mid-, and high-level representations, and then aggregates their knowledge into the final student. We integrate feature-level alignment, auxiliary similarity-logit alignment, and supervised loss during distillation. Experiments on benchmark remote sensing datasets (RSITMD and RSICD) show that HIMSKD improves retrieval performance and enhances zero-shot classification. When a compact student is used, it also reduces deployment cost while retaining strong accuracy.
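
The abstract names three loss terms (feature-level alignment, similarity-logit alignment, and a supervised loss) without giving their exact form. The sketch below is one plausible way to combine them for a single teacher-to-student step in an image–text retrieval setting. It is an illustration under stated assumptions, not the paper's implementation: the mean-squared-error feature term, the KL-divergence term over softened similarity rows, the image-to-text cross-entropy term, the weights `alpha`, `beta`, `gamma`, and all function names are hypothetical. It also assumes student and teacher embeddings share the same dimension, and it does not cover the assistant models or the aggregation stage.

```python
import math

def softmax(row, temperature=1.0):
    """Numerically stable softmax over one row of logits."""
    scaled = [x / temperature for x in row]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def similarity_matrix(image_feats, text_feats):
    """Dot-product similarity between every image and every text embedding."""
    return [[sum(a * b for a, b in zip(img, txt)) for txt in text_feats]
            for img in image_feats]

def feature_alignment_loss(student_feats, teacher_feats):
    """Assumed form: mean squared error between student and teacher embeddings."""
    n = sum(len(v) for v in student_feats)
    return sum((s - t) ** 2
               for sv, tv in zip(student_feats, teacher_feats)
               for s, t in zip(sv, tv)) / n

def similarity_logit_loss(student_sim, teacher_sim, temperature=1.0):
    """Assumed form: row-wise KL(teacher || student) over softened similarities."""
    total = 0.0
    for s_row, t_row in zip(student_sim, teacher_sim):
        p = softmax(t_row, temperature)
        q = softmax(s_row, temperature)
        total += sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return total / len(student_sim)

def supervised_contrastive_loss(sim):
    """Image-to-text cross-entropy, assuming image i is paired with text i."""
    return -sum(math.log(softmax(row)[i]) for i, row in enumerate(sim)) / len(sim)

def distillation_loss(s_img, s_txt, t_img, t_txt,
                      alpha=1.0, beta=1.0, gamma=1.0):
    """Weighted sum of the three terms; the weights are hypothetical."""
    s_sim = similarity_matrix(s_img, s_txt)
    t_sim = similarity_matrix(t_img, t_txt)
    feat = feature_alignment_loss(s_img + s_txt, t_img + t_txt)
    logit = similarity_logit_loss(s_sim, t_sim)
    sup = supervised_contrastive_loss(s_sim)
    return alpha * feat + beta * logit + gamma * sup
```

With this formulation, a student that reproduces the teacher's embeddings exactly has zero feature and similarity-logit terms, so only the supervised term remains. In a multi-stage setup like the one described, the same function could in principle be reused at each stage with the previous stage's model in the teacher role, though the source does not say whether the framework does this.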
