March 3, 2026 | Open Access

SF-Net: Spatial-Frequency Feature Synthesis for Semantic Segmentation of High-Resolution Remote Sensing Imagery

Key Points

SF-Net achieves state-of-the-art mean Intersection over Union (mIoU) scores of 83.12% on ISPRS Vaihingen, 53.28% on LoveDA, and 83.35% on Potsdam.
SF-Net synthesizes features from both the spatial and frequency domains, addressing a limitation of conventional approaches that operate only in the spatial domain.
A multi-scale Convolutional Grouping Fusion Module (CGFM) captures spatial features at different resolutions.
By improving feature representation, SF-Net supports robust environmental surveillance and detailed land use mapping in remote sensing applications.
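
For readers unfamiliar with the metric quoted in the first key point, the sketch below shows one common way to compute mIoU from a predicted and a reference label map. It is an illustration only: the function name `mean_iou` is hypothetical, and the averaging over classes that actually occur is an assumption, since the evaluation protocol of each benchmark (for example, which classes are ignored) is not described here.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean Intersection over Union over the classes that occur.

    pred and target are integer label arrays of the same shape, with
    values in [0, num_classes). Hypothetical helper for illustration.
    """
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    # Confusion matrix: rows are reference classes, columns are predictions.
    cm = np.bincount(target * num_classes + pred, minlength=num_classes ** 2)
    cm = cm.reshape(num_classes, num_classes)
    tp = np.diag(cm)                                # correctly labeled pixels
    union = cm.sum(axis=0) + cm.sum(axis=1) - tp    # predicted + reference - overlap
    present = union > 0                             # skip classes absent from both maps
    return float((tp[present] / union[present]).mean())

if __name__ == "__main__":
    pred = np.array([[0, 0], [1, 1]])
    target = np.array([[0, 1], [1, 1]])
    print(mean_iou(pred, target, num_classes=2))
```

In this toy case class 0 has an IoU of 1/2 and class 1 an IoU of 2/3, so the mean is 7/12.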

Abstract

Precise semantic segmentation of High-Resolution Remote Sensing (HRRS) images is essential for robust environmental surveillance and detailed land use mapping. Despite substantial advances in deep learning, most conventional approaches operate only in the spatial domain and neglect the rich textural and structural cues available in the frequency domain, which limits how completely the image content is represented. To address this issue, we introduce SF-Net, a network that synthesizes features across the spatial and frequency domains. SF-Net first employs a multi-scale Convolutional Grouping Fusion Module (CGFM) to extract spatial features at varying resolutions. The Haar Wavelet Transform then decomposes these features into low-frequency components (structure) and high-frequency components (detail). Subsequently, a Mamba-enhanced Global Spatial Feature Extraction Module (GSFEM) reinforces low-frequency semantic information with global context, while a Spatial-Frequency Fusion Module (S-FFM) applies targeted attention to sharpen high-frequency details. Experiments on the ISPRS Vaihingen, LoveDA, and Potsdam benchmarks yield state-of-the-art mean Intersection over Union (mIoU) scores of 83.12%, 53.28%, and 83.35%, respectively, confirming the effectiveness of SF-Net.
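
The Haar decomposition step mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the SF-Net implementation: the function names `haar_dwt2` and `haar_idwt2` are hypothetical, the sub-band labels follow one of several naming conventions, and how SF-Net applies the transform inside the network is not specified here. The sketch shows only a single-level 2D Haar transform that splits a feature map into one low-frequency sub-band and three high-frequency sub-bands, plus its exact inverse.

```python
import numpy as np

def haar_dwt2(x):
    """Single-level 2D orthonormal Haar transform (illustrative sketch).

    x: array whose last two axes are height and width, both even.
    Returns (ll, lh, hl, hh), each at half the input resolution.
    """
    x = np.asarray(x, dtype=np.float64)
    h, w = x.shape[-2:]
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    a = x[..., 0::2, 0::2]  # top-left pixel of each 2x2 block
    b = x[..., 0::2, 1::2]  # top-right
    c = x[..., 1::2, 0::2]  # bottom-left
    d = x[..., 1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # scaled local average: coarse structure
    lh = (a + b - c - d) / 2.0  # top row minus bottom row
    hl = (a - b + c - d) / 2.0  # left column minus right column
    hh = (a - b - c + d) / 2.0  # diagonal difference
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2: rebuilds the full-resolution array."""
    h, w = ll.shape[-2:]
    x = np.empty(ll.shape[:-2] + (2 * h, 2 * w), dtype=np.float64)
    x[..., 0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[..., 0::2, 1::2] = (ll + lh - hl - hh) / 2.0
    x[..., 1::2, 0::2] = (ll - lh + hl - hh) / 2.0
    x[..., 1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x

if __name__ == "__main__":
    feature_map = np.random.default_rng(0).normal(size=(8, 16, 16))  # (channels, H, W)
    ll, lh, hl, hh = haar_dwt2(feature_map)
    print(ll.shape)  # each sub-band is (8, 8, 8)
    print(np.allclose(haar_idwt2(ll, lh, hl, hh), feature_map))
```

Because the transform is orthonormal and invertible, no information is lost by the split. A smooth region contributes almost entirely to the low-frequency sub-band, while edges and fine texture appear in the three detail sub-bands, which is the structure/detail separation the abstract refers to.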
