Reliable monitoring of fish-schools, which serve as key ecological indicators, is a core challenge in marine environmental monitoring. Accurate detection is essential for evaluating species distribution, migration, and ecosystem health. However, sonar imaging faces severe difficulties when detecting fish-schools in dynamic underwater ecosystems, notably the small scale of the targets and noise interference. This paper proposes YOLOv11-SAS, an enhanced You Only Look Once version 11 (YOLOv11) framework that integrates a Simple Parameter-Free Attention Module (SimAM) and an Adaptive Feature Pyramid Network (AFPN)-based feature fusion module into the original YOLOv11. SimAM dynamically recalibrates feature weights to suppress noise while enhancing feature sensitivity. In parallel, the AFPN structure provides enhanced cross-scale fusion and is complemented by a dedicated small-object detection layer, P2, which strengthens shallow features to improve recognition of very small targets. Together, SimAM and AFPN-P2 enhance the model's feature learning capability. To validate the model, we develop a Dynamic Multi-Scale Dense Sonar image dataset (DMDS), which comprises multi-scale targets and high-density clusters such as fish-schools. Experimental results show that YOLOv11-SAS achieves 96.8% accuracy on DMDS, outperforming YOLOv11 by 5.0% across all targets and by 27.2% on dense fish-school detection. Compared with the state of the art, accuracy improves by 1.8% across all targets and by 11.1% for fish-schools. This performance improvement provides a solid foundation for ecological applications such as fishery stock assessment, migration tracking, and conservation strategy development.

• A novel YOLOv11-SAS framework is proposed for dense fish-school detection.
• SimAM integration improves feature weighting and suppresses background noise.
• AFPN-P2 enhances feature fusion, contributing to multi-scale object detection.
• A DMDS dataset is developed for underwater dynamic object detection.
• YOLOv11-SAS outperforms YOLOv11 by 27.2% on detection of dense fish-schools.
Zhang et al. (Thu,) studied this question.
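To make the SimAM recalibration step described above concrete, the following is a minimal NumPy sketch of the commonly published SimAM formulation: each activation is gated by a sigmoid of its inverse energy, computed from per-channel statistics with no learnable parameters. It is an illustration under stated assumptions, not the YOLOv11-SAS implementation. The function names `simam_weights` and `simam`, the regularizer value `lam=1e-4`, and the single-image `(C, H, W)` layout are all assumptions made here; where the module sits in the network is not reproduced.

```python
import numpy as np


def simam_weights(x, lam=1e-4):
    """Return SimAM-style attention weights for a feature map.

    x   : array of shape (C, H, W), one feature map per channel (assumed layout).
    lam : small regularizer added to the channel variance (assumed value).
    """
    _, h, w = x.shape
    n = h * w - 1  # number of "other" neurons in each channel

    # Squared deviation of every activation from its channel mean.
    mu = x.mean(axis=(1, 2), keepdims=True)
    d = (x - mu) ** 2

    # Per-channel variance estimate over the spatial positions.
    var = d.sum(axis=(1, 2), keepdims=True) / n

    # Inverse energy: activations that stand out from their channel score higher.
    e_inv = d / (4.0 * (var + lam)) + 0.5

    # Sigmoid squashes the inverse energy into a gate in (0, 1).
    return 1.0 / (1.0 + np.exp(-e_inv))


def simam(x, lam=1e-4):
    """Apply the parameter-free gate to the feature map."""
    return x * simam_weights(x, lam)


if __name__ == "__main__":
    # Toy single-channel map: flat background with one strong response,
    # loosely standing in for a small target against uniform clutter.
    feat = np.ones((1, 4, 4))
    feat[0, 1, 1] = 5.0
    gate = simam_weights(feat)
    print("background gate:", gate[0, 0, 0])
    print("target gate:    ", gate[0, 1, 1])
```

In this toy map the isolated strong response receives a larger gate than the flat background, which is the behavior the abstract attributes to SimAM: distinctive activations are emphasized relative to uniform regions without adding any trainable weights.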