Multi-task learning has shown great potential for improving performance across multiple underwater vision tasks and has attracted increasing attention. However, research in this area has been limited by the lack of large-scale underwater datasets with ground-truth annotations for multiple tasks. To address this issue, we introduce UVMulti, the first large-scale, high-resolution, video-level benchmark for underwater multi-task learning, which contains 100 video sequences with 87,352 frames. The dataset provides pixel-level segmentation annotations, image enhancement annotations, and depth ground-truth annotations. To suit real underwater application scenarios and data characteristics, our multi-task learning framework, UVMT-Net, integrates multiple learning paradigms and makes full use of limited annotated data to improve multi-task performance. Furthermore, we design an adaptive task weight adjustment (AWA) strategy to improve the performance of the main task while maintaining the performance of the auxiliary task. Extensive experiments highlight the UVMulti dataset as a robust and versatile benchmark for underwater multi-task learning.
Li et al. (Mon,) studied this question.