What question did this study set out to answer?

The aim is to construct a deep learning model to classify ground objects in high-resolution remote sensing images while addressing model deployment challenges.

April 1, 2026

Construction of deep learning model for automatic interpretation of high-resolution remote sensing images and classification of ground objects

Key Points

The aim is to construct a deep learning model to classify ground objects in high-resolution remote sensing images while addressing model deployment challenges.
Developed a lightweight multi-scale dual attention deep learning model (MS-DANet).
Used MobileNetV2 to extract multi-level features.
Implemented parallel atrous spatial pyramid pooling (ASPP) and feature pyramid network (FPN) structures for effective feature fusion.
Integrated a channel-spatial dual attention module to enhance key features.
Compressed the model to 4.8M parameters for efficient deployment.
Achieved an overall accuracy of 90.6% on the ISPRS Vaihingen dataset.
Achieved an average F1 score of 91.2% and an mIoU of 83.9%.
Outperformed mainstream models such as U-Net and DeepLabV3+, with only 8.8% of the parameters of the latter.
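
Overall accuracy, average F1, and mIoU can all be derived from a single confusion matrix. The following is a minimal sketch of those definitions, not the study's evaluation code. It assumes rows are ground-truth classes and columns are predicted classes; the function name and the toy 2-class matrix are illustrative and do not come from the study.

```python
def segmentation_metrics(cm):
    """Return (overall accuracy, mean F1, mIoU) from a square confusion matrix.

    cm[i][j] counts pixels whose true class is i and predicted class is j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(n))

    f1_scores, ious = [], []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                       # true class k, predicted otherwise
        fp = sum(cm[i][k] for i in range(n)) - tp  # predicted k, true class differs
        f1_den = 2 * tp + fp + fn
        iou_den = tp + fp + fn
        f1_scores.append(2 * tp / f1_den if f1_den else 0.0)
        ious.append(tp / iou_den if iou_den else 0.0)

    return correct / total, sum(f1_scores) / n, sum(ious) / n


# Toy 2-class example (made-up counts, for illustration only).
toy_cm = [[8, 2],
          [1, 9]]
oa, mean_f1, miou = segmentation_metrics(toy_cm)
print(f"OA={oa:.3f}  mean F1={mean_f1:.3f}  mIoU={miou:.3f}")
```

Because IoU penalizes both false positives and false negatives in the denominator without doubling the true positives, mIoU is always at or below the mean F1 for the same predictions, which is consistent with the gap between the reported 91.2% and 83.9%.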

Abstract

This paper introduces MS-DANet, a lightweight, multi-scale, dual attention deep learning model that addresses three challenges in land cover classification of high-resolution remote sensing images: high inter-class similarity, significant multi-scale differences, and constraints on model deployment. The model is based on an encoder-decoder structure. At the encoding end, MobileNetV2 extracts multi-level features; at the decoding end, parallel atrous spatial pyramid pooling (ASPP) and feature pyramid network (FPN) structures fuse global semantics with local detail information. A channel-spatial dual attention module is embedded to adaptively enhance key features and alleviate category confusion. To achieve both high accuracy and high efficiency, the model adopts depthwise separable convolution (DSC) and knowledge distillation, compressing the parameter count to 4.8M. Experimental results on the ISPRS Vaihingen dataset show that MS-DANet reaches an overall accuracy of 90.6%, an average F1 score of 91.2%, and an mIoU of 83.9%, outperforming mainstream models such as U-Net and DeepLabV3+ while using only 8.8% of the parameters of the latter. These results support its effectiveness and practicality for automatic interpretation of high-resolution remote sensing images.
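
To illustrate why depthwise separable convolution reduces the parameter count, the sketch below compares it with a standard convolution of the same shape. It is a back-of-the-envelope calculation, not the study's architecture: the 3x3 kernel and 256 channels are assumed values, bias terms are ignored, and the function names are hypothetical. The last lines work out the comparison model size implied by the abstract's figures, assuming the 8.8% is relative to DeepLabV3+ and that both figures are rounded.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard kernel x kernel convolution (bias ignored)."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel, c_in, c_out):
    """Weights in a depthwise conv followed by a 1x1 pointwise conv (bias ignored)."""
    depthwise = kernel * kernel * c_in  # one kernel x kernel filter per input channel
    pointwise = c_in * c_out            # 1x1 convolution that mixes channels
    return depthwise + pointwise


# Assumed layer shape, for illustration only.
kernel, c_in, c_out = 3, 256, 256
standard = standard_conv_params(kernel, c_in, c_out)
separable = depthwise_separable_params(kernel, c_in, c_out)
ratio = separable / standard  # equals 1/c_out + 1/kernel**2

print(f"standard={standard}  separable={separable}  ratio={ratio:.3f}")

# Comparison model size implied by the abstract: 4.8M is stated to be 8.8% of it.
implied_baseline_millions = 4.8 / 0.088
print(f"implied comparison model size: about {implied_baseline_millions:.1f}M parameters")
```

For this assumed layer the separable form needs roughly 11.5% of the weights of the standard form. The overall 8.8% figure in the abstract reflects the whole network, including the MobileNetV2 encoder and knowledge distillation, so the two percentages are not directly comparable.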
