September 5, 2025 · Open Access

An Ultra-Lightweight and High-Precision Underwater Object Detection Algorithm for SAS Images

Key Points

The proposed algorithm achieves higher detection precision than the baseline model and Faster R-CNN while using far fewer parameters.
On the SCTD dataset, it exceeds the baseline model's precision with only 10% of the baseline's parameters.
Key innovations include the Dilated-Attention Aggregation Feature Module (DAAFM) for feature extraction and the Context-Aware Feature Enhancement Module (CAFEM) for context-guided feature learning.
A two-stage training strategy, which trains on real sonar images and then fine-tunes on diffusion-generated synthetic images, addresses the scarcity of real sonar data and improves generalization.
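
As a rough illustration of why dilation helps spatial coverage in a lightweight module such as DAAFM, the sketch below computes the input span covered by a single dilated convolution kernel using the standard formula k + (k - 1)(d - 1). The 3x3 kernel size and the dilation rates (1, 2, 3) are assumptions chosen for illustration; the paper's actual settings are not stated here.

```python
def effective_kernel(kernel_size: int, dilation: int) -> int:
    """Span, in input pixels, covered by one dilated convolution kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


# Hypothetical dilation rates for a multi-scale block (not taken from the paper).
ASSUMED_DILATIONS = (1, 2, 3)


def branch_spans(kernel_size: int = 3, dilations=ASSUMED_DILATIONS) -> dict:
    """Map each assumed dilation rate to the span its kernel covers."""
    return {d: effective_kernel(kernel_size, d) for d in dilations}


if __name__ == "__main__":
    for d, span in branch_spans().items():
        # Each branch still has kernel_size**2 weights per channel pair.
        print(f"dilation={d}: covers {span}x{span} pixels with 9 weights")
```

Under these assumptions, every branch keeps the same weight count while covering a wider area, which is consistent with the stated goal of broader coverage at low parameter cost.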

Abstract

Underwater Object Detection (UOD) based on Synthetic Aperture Sonar (SAS) images is one of the core tasks of underwater intelligent perception systems. However, existing UOD methods suffer from excessive model redundancy, high computational demands, and severe image quality degradation due to noise. To mitigate these issues, this paper proposes an ultra-lightweight and high-precision underwater object detection method for SAS images. Based on a single-stage detection framework, four efficient and representative lightweight modules are developed, focusing on three key stages: feature extraction, feature fusion, and feature enhancement. For feature extraction, the Dilated-Attention Aggregation Feature Module (DAAFM) is introduced, which leverages a multi-scale Dilated Attention mechanism to strengthen the model’s capability to perceive key information, thereby improving the expressiveness and spatial coverage of extracted features. For feature fusion, the Channel–Spatial Parallel Attention with Gated Enhancement (CSPA-Gate) module is proposed, which integrates channel–spatial parallel modeling and gated enhancement to achieve effective fusion of multi-level semantic features and dynamic response to salient regions. For feature enhancement, the Spatial Gated Channel Attention Module (SGCAM) is introduced to strengthen the model’s ability to discriminate the importance of feature channels through spatial gating, thereby improving robustness to complex background interference. Furthermore, the Context-Aware Feature Enhancement Module (CAFEM) is designed to guide feature learning using contextual structural information, enhancing semantic consistency and feature stability from a global perspective. To alleviate the limited number of real sonar images, a diffusion generative model is employed to synthesize a set of pseudo-sonar images, which are then combined with the real sonar dataset to construct an augmented training set.
A two-stage training strategy is proposed: the model is first trained on the real dataset and then fine-tuned on the synthetic dataset to enhance generalization and improve detection robustness. Results on the SCTD dataset show that the proposed method achieves higher precision than the baseline model with only 10% of its parameters. Notably, on a hybrid dataset, the proposed method surpasses Faster R-CNN by 10.3% in mAP50 while using only 9% of its parameters.
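
The ordering of the two-stage strategy (real data first, then fine-tuning on synthetic data) can be sketched with a toy model. This is not the paper's detector or training code: the linear model, the epoch counts, the learning rates, and the generated data are all assumptions made purely to show the schedule.

```python
import random


def sgd_epoch(w, b, data, lr):
    """One pass of SGD on squared error for the toy model y = w*x + b."""
    for x, y in data:
        err = (w * x + b) - y
        w -= lr * err * x
        b -= lr * err
    return w, b


def two_stage_train(real, synthetic, epochs=(50, 20), lrs=(0.05, 0.005)):
    """Stage 1 trains on real data; stage 2 fine-tunes on synthetic data.

    The epoch counts and learning rates are hypothetical defaults.
    """
    w, b = 0.0, 0.0
    for _ in range(epochs[0]):  # stage 1: real samples
        w, b = sgd_epoch(w, b, real, lrs[0])
    for _ in range(epochs[1]):  # stage 2: synthetic samples, smaller step
        w, b = sgd_epoch(w, b, synthetic, lrs[1])
    return w, b


rng = random.Random(0)
# Toy stand-ins: a small "real" set and a larger, slightly noisy "synthetic" set.
real = [(x, 2.0 * x + 1.0) for x in (rng.uniform(-1, 1) for _ in range(20))]
synthetic = [(x, 2.0 * x + 1.0 + rng.gauss(0, 0.05))
             for x in (rng.uniform(-1, 1) for _ in range(200))]

w, b = two_stage_train(real, synthetic)
print(f"w={w:.3f}, b={b:.3f}")
```

Using a smaller learning rate in the second stage is a common fine-tuning choice, assumed here so that the synthetic samples adjust rather than overwrite what was learned from real data.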
