Salient object detection in optical remote sensing images remains challenging due to complex backgrounds, blurred boundaries, small objects, unstable foreground–background contrast, and dense object distributions. Existing convolution-based methods are effective at modeling local structures but are limited in capturing long-range dependencies, whereas Transformer-based approaches usually incur substantial computational cost when handling high-resolution remote sensing imagery. To address these issues, this paper proposes EGMamba-Net, an edge-guided global–local collaborative network for salient object detection in optical remote sensing images. Specifically, a hybrid global–local backbone is first constructed to preserve shallow texture, edge, and geometric details while introducing Mamba-based global modeling in deeper stages for efficient long-range dependency representation. An Edge Prior Enhancement Module (EPEM) is then designed to explicitly extract boundary priors from shallow features and refine feature representations through edge-guided modulation. To alleviate the representation conflict between global semantics and local details, a Global–Local Interaction Module (GLIM) is further developed, in which convolutional local modeling and Mamba-based global modeling interact through cross-gating for complementary feature learning. Moreover, a Region-Adaptive Routing Decoder (RARD) is introduced to dynamically assign different refinement paths according to regional saliency response, boundary intensity, and contextual complexity, thereby improving the recovery of small, low-contrast, and densely distributed objects. In addition, a Difficulty-Aware Joint Loss (DAJL) is designed to enhance optimization on boundary regions and hard samples, improving robustness under challenging conditions. Extensive experiments on the ORSSD, EORSSD, and ORSI-4199 datasets demonstrate the superiority of the proposed method.
In particular, on the more challenging EORSSD dataset, EGMamba-Net achieves an S-measure of 0.9389, a max F-measure of 0.8972, and an MAE of 0.0066. Compared with the representative remote-sensing method DAF-Net, it improves S-measure and max F-measure by 0.0223 and 0.0358, respectively, indicating stronger capability in background suppression, structural preservation, and boundary recovery.
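For readers unfamiliar with the reported metrics, the following is a minimal sketch of how MAE and max F-measure are commonly computed in the salient object detection literature. It is not the authors' evaluation code. The function names, the flattened-list input format, the weight β² = 0.3, and the sweep over 256 evenly spaced thresholds are assumptions that follow common practice and are not stated in the text above. S-measure is omitted because its structural terms need a longer implementation.

```python
def mae(pred, gt):
    """Mean absolute error between a saliency map and a binary mask.

    Both inputs are flattened lists of equal length; pred values lie in
    [0, 1] and gt values are 0 or 1 (assumed input format).
    """
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and equally long")
    return sum(abs(p - g) for p, g in zip(pred, gt)) / len(pred)


def max_f_measure(pred, gt, beta2=0.3, steps=256):
    """Largest F-measure over a sweep of binarization thresholds.

    beta2 = 0.3 and 256 thresholds are conventional choices (assumed here).
    """
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and equally long")
    best = 0.0
    for i in range(steps):
        t = i / (steps - 1)
        tp = sum(1 for p, g in zip(pred, gt) if p >= t and g == 1)
        fp = sum(1 for p, g in zip(pred, gt) if p >= t and g == 0)
        fn = sum(1 for p, g in zip(pred, gt) if p < t and g == 1)
        if tp == 0:
            continue  # precision and recall are both zero at this threshold
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        f = (1 + beta2) * precision * recall / (beta2 * precision + recall)
        best = max(best, f)
    return best


if __name__ == "__main__":
    # Toy 2x2 saliency map, flattened (illustrative values only).
    pred = [0.9, 0.8, 0.2, 0.1]
    gt = [1, 1, 0, 0]
    print("MAE:", mae(pred, gt))
    print("max F-measure:", max_f_measure(pred, gt))
```

Lower MAE is better, while higher S-measure and F-measure are better, which is why the reported gains over DAF-Net are expressed as increases in the latter two.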
Zhang et al. (Thu,) studied this question.