Remote sensing scene classification is one of the crucial techniques for high-resolution remote sensing image interpretation and has received widespread attention in recent years. However, acquiring high-quality labeled data is both costly and time-consuming, making unsupervised domain adaptation (UDA) an important research focus in scene classification. Existing UDA methods focus primarily on aligning the overall feature distributions across domains but neglect class feature alignment, resulting in the loss of critical class information. To address this issue, a cross-layer feature fusion and attention-based class feature alignment network (CFACA-NET) is proposed for unsupervised cross-domain remote sensing scene classification. Specifically, a multi-layer feature extraction module (MFEM) consisting of a cross-layer feature fusion module (CFFM), a multi-scale dynamic attention module (MSDAM), and a fused feature optimization module (FFOM) is designed to enhance the representation ability of scene features. A high-confidence sample selection module is further introduced, which utilizes evidence theory and information entropy to obtain reliable pseudo-labels. Finally, a class feature alignment module is proposed, incorporating a two-stage training strategy to achieve effective class feature alignment. Experimental results on three remote sensing scene classification datasets demonstrate that CFACA-NET outperforms existing state-of-the-art methods in cross-domain classification performance, effectively enhancing cross-domain adaptation capability.
Wei et al. (Wed,) studied this question.