To address the challenges faced by unmanned surface vehicles (USVs) in complex aquatic environments, specifically missed detection of small objects, interference from water surface reflections and glare, and insufficient real-time performance, this paper proposes YOLO-USV, a lightweight, high-precision detection network based on an improved YOLOv5. First, a Cross-Stage Enhanced Channel Attention module (C3-ECA) is introduced into the backbone network to make shallow features more sensitive to small obstacles (e.g., buoys and semi-submerged objects) while suppressing background noise caused by water surface reflections. Second, a lightweight Bidirectional Feature Pyramid Network (BiFPN-Lite) is designed; it uses depthwise separable convolutions to reduce the parameter count of the neck and cross-scale weighted fusion to make multi-scale obstacle detection more robust. Third, TensorRT is used to accelerate inference. Experiments on the FloW dataset, which includes scenes with significant water surface reflections and glare, show that YOLO-USV achieves a mAP@0.5 of 95.9% and a mAP@0.5:0.95 of 62.1%, an improvement of 4.4% over the baseline model. The model size is only 12.6 MB, and the inference speed reaches 142.8 FPS. Cross-dataset experiments further indicate that the proposed algorithm generalizes across datasets, providing preliminary evidence of its feasibility and superiority.
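The two ideas behind the BiFPN-Lite neck can be illustrated with a minimal sketch. It is not the authors' implementation. The channel count (256) and kernel size (3) are illustrative assumptions, and the fusion rule shown is the fast normalized fusion from the original BiFPN design, assumed here to be what "cross-scale weighted fusion" refers to. The function names are hypothetical.

```python
def conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def dw_separable_params(c_in, c_out, k):
    """Weight count of a depthwise k x k conv plus a 1 x 1 pointwise conv (bias ignored)."""
    return k * k * c_in + c_in * c_out


def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-length feature vectors with non-negative, normalized weights.

    Assumed rule (original BiFPN): out = sum_i relu(w_i) * f_i / (eps + sum_j relu(w_j)).
    """
    clipped = [max(w, 0.0) for w in weights]  # ReLU keeps each weight non-negative
    total = sum(clipped) + eps                # eps guards against division by zero
    fused = [0.0] * len(features[0])
    for w, feat in zip(clipped, features):
        for i, value in enumerate(feat):
            fused[i] += (w / total) * value
    return fused


if __name__ == "__main__":
    # Illustrative layer size only; the paper's actual neck widths are not given here.
    c_in, c_out, k = 256, 256, 3
    standard = conv_params(c_in, c_out, k)
    separable = dw_separable_params(c_in, c_out, k)
    print("standard conv weights: ", standard)
    print("separable conv weights:", separable)
    print("reduction factor:       %.2f" % (standard / separable))

    # Two toy feature vectors standing in for feature maps from different scales.
    print(fast_normalized_fusion([[1.0, 2.0], [3.0, 4.0]], [0.7, 0.3]))
```

Under these assumptions, the separable layer needs roughly one ninth of the weights of the standard 3 x 3 layer, which is the kind of saving the lightweight neck relies on. The learnable fusion weights let the network decide how much each scale contributes.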
Liu et al. (Sat,) studied this question.