Efficient and accurate underwater target recognition is crucial for assessing the stability of marine ecosystems. However, traditional manual diving surveys are costly, time-consuming, and limited in coverage. Advances in AI have made intelligent underwater robots a promising alternative, but their perception typically relies on large amounts of accurately annotated data, which are difficult to obtain for underwater imagery because imaging quality is poor and pixel-level annotations are scarce. To address these issues, this paper proposes a semi-supervised semantic segmentation framework designed for underwater ecological monitoring. First, a multi-modal perturbation consistency method based on a multi-branch structure is constructed to process multiple input modalities simultaneously and extract intermediate features. Second, a novel pseudo-label screening strategy based on semantic similarity is proposed, which uses the mapping relationship between labeled and unlabeled images to select reliable pseudo-labels. Experiments were conducted on three underwater scene datasets, and further tests were carried out on our self-developed underwater platform. On the SUIM dataset, the method reaches mIoU values of 72.06% and 71.45% under the 1/2 and 1/4 splits, respectively, outperforming existing mainstream methods. In the future, the method can be deployed on underwater platforms to support automated, large-scale marine ecological monitoring. The code is available at https://github.com/A0268/uwea.
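For readers unfamiliar with the reported metric, the following is a minimal sketch of how mean intersection over union (mIoU) is commonly computed for a segmentation result. It is not taken from the authors' repository: the function name `mean_iou`, the flat label lists, and the choice to skip classes absent from both prediction and ground truth are illustrative assumptions, and the paper's exact evaluation protocol (for example, accumulating counts over the whole test set or ignoring certain labels) may differ.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean IoU over classes that appear in the prediction or the ground truth.

    `pred` and `target` are flattened per-pixel class indices (hypothetical
    input format chosen for this sketch).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must contain the same number of pixels")

    intersection = [0] * num_classes
    pred_count = [0] * num_classes
    target_count = [0] * num_classes

    # Count per-class pixels and per-class agreements in a single pass.
    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # skip classes absent from both pred and target
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes and six pixels.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
```

In this toy case the per-class IoU values are 1/3, 2/3, and 1/2, so `score` is 0.5. In practice the counts are usually accumulated over all test images before dividing, rather than averaged per image.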
Zhao et al. (Sun,) studied this question.