Birds play an essential role in evaluating the health and biodiversity of wetland ecosystems. Because wetland environments are complex and diverse and birds are typically small, existing detection methods suffer from low accuracy and high miss rates. To address these challenges, this study proposes RLCB-YOLO, a wetland bird detection framework based on YOLOv8n. The proposed convolutional modules combine receptive field attention and coordinate attention to address the problem of attention weight sharing and to enhance long-range information processing. Additionally, an SPPF-LSKA module is introduced that exploits long-range dependencies and adaptive scaling to filter background noise in complex wetland environments. For feature fusion, an improved BiFPN-P2 structure is adopted to strengthen cross-scale information interaction. Finally, a content-aware feature reorganization module is applied at the up-sampling stage so that the network focuses precisely on the key semantic features of small-scale targets. Experimental results show that RLCB-YOLO achieves 82.1% mAP@0.5 and 48.6% mAP@0.5:0.95 on a self-built dataset of small wetland bird targets, outperforming the baseline YOLOv8n by 3.6% and 2.9%, respectively. Furthermore, it outperforms YOLOv8s in overall performance while using fewer parameters. Visualization analysis further confirms the model’s suitability for engineering applications in ecological monitoring of complex wetland scenes.
Xing et al. studied this question.
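The two reported metrics, mAP@0.5 and mAP@0.5:0.95, both depend on the intersection over union (IoU) between a predicted and a ground-truth box. The sketch below is illustrative only and is not taken from the paper. It assumes boxes given as (x1, y1, x2, y2) corner coordinates and the common COCO-style convention that mAP@0.5:0.95 averages over ten IoU thresholds from 0.50 to 0.95 in steps of 0.05. The helper names `iou` and `thresholds_passed` are hypothetical. It shows why small targets are hard under the stricter metric: the same 2-pixel localization error costs a small box far more IoU than a large one.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp at zero so non-overlapping boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Assumed COCO-style thresholds: 0.50, 0.55, ..., 0.95.
THRESHOLDS = [(50 + 5 * i) / 100 for i in range(10)]


def thresholds_passed(iou_value):
    """Count the IoU thresholds at which a detection counts as a true positive."""
    return sum(iou_value >= t for t in THRESHOLDS)


# The same 2-pixel horizontal shift applied to a small and a large box.
small = iou((0, 0, 10, 10), (2, 0, 12, 10))      # about 0.667
large = iou((0, 0, 100, 100), (2, 0, 102, 100))  # about 0.961

print(small, thresholds_passed(small))  # passes 4 of the 10 thresholds
print(large, thresholds_passed(large))  # passes all 10 thresholds
```

Under these assumptions, both detections count as correct at the 0.5 threshold, but the small box fails the six strictest thresholds, which is one reason mAP@0.5:0.95 values tend to be much lower than mAP@0.5 values on small-target datasets.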