August 17, 2025 | Open Access

Method of adaptive knowledge distillation from multi-teacher to student deep learning models

Key Points

EMTKD achieves 88.5% target-domain accuracy on a cross-domain cardiac MRI benchmark.
The area under the curve reaches 92.5%, and EMTKD outperforms state-of-the-art methods by up to 5.0% on the same benchmark.
The framework combines domain adaptation within teacher training, instance-specific adaptive weighting for knowledge fusion, and semi-supervised learning on unlabeled data.
The approach improves student performance in data-scarce environments, suggesting broader applicability to deep learning deployment.

Abstract

Transferring knowledge from multiple teacher models to a compact student model is often hindered by domain shifts between datasets and a scarcity of labeled target data, degrading performance. While existing methods address parts of this problem, a unified framework is lacking. In this work, we improve multi-teacher knowledge distillation by developing a holistic framework, enhanced multi-teacher knowledge distillation (EMTKD), that synergistically integrates three components: domain adaptation within teacher training, an instance-specific adaptive weighting mechanism for knowledge fusion, and semi-supervised learning to leverage unlabeled data. On a challenging cross-domain cardiac MRI benchmark, EMTKD achieves a target domain accuracy of 88.5% and an area under the curve of 92.5%, outperforming state-of-the-art techniques by up to 5.0%. Our results demonstrate that this integrated, adaptive approach yields significantly more robust and accurate student models, enabling effective deep learning deployment in data-scarce environments.
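The instance-specific adaptive weighting named in the abstract can be illustrated with a minimal sketch. The abstract does not state how EMTKD computes its weights, so the rule used here (a softmax over negative teacher entropy, so that more confident teachers count more for a given sample) is an assumption chosen purely for illustration, as are all function names, the temperature value, and the example logits. The sketch covers only the fusion step and the resulting distillation loss; it does not include the domain adaptation or semi-supervised components.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def instance_weights(teacher_probs):
    """ASSUMED rule (not from the paper): per-instance weights are a
    softmax over negative entropy, so confident teachers get more weight."""
    return softmax([-entropy(p) for p in teacher_probs])

def fuse_teachers(teacher_logits, temperature=2.0):
    """Return (fused soft target, per-teacher weights) for one instance."""
    probs = [softmax(logits, temperature) for logits in teacher_logits]
    weights = instance_weights(probs)
    n_classes = len(probs[0])
    fused = [sum(w * p[c] for w, p in zip(weights, probs))
             for c in range(n_classes)]
    return fused, weights

def distillation_loss(student_logits, fused_target, temperature=2.0):
    """KL(fused || student) at temperature T, with the common T^2 scaling."""
    student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s)
             for t, s in zip(fused_target, student) if t > 0.0)
    return temperature ** 2 * kl

# Hypothetical logits for one sample: teacher 0 is confident, teacher 1 is not.
teacher_logits = [[4.0, 0.5, 0.1], [1.0, 0.9, 1.1]]
fused, weights = fuse_teachers(teacher_logits)
loss = distillation_loss([2.0, 0.3, 0.2], fused)
```

Because the weights are recomputed for every sample, a teacher that is unreliable on one input (for example, one trained on a distant source domain) can be down-weighted there without being discounted globally. A learned gating network is another plausible design for the same purpose.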
