April 12, 2024 · Open Access

Adaptive Multi-Scale Feature Fusion with Spatial Translation for Semantic Segmentation

Abstract

In image segmentation tasks, extracting multi-scale features enables models to adapt to targets of diverse scales and to capture semantic information more comprehensively. In addition, the rational use of receptive fields helps models understand both local and global image structure. In this work, we combine the advantages of these two approaches and propose a novel adaptive module termed the MFFM. Much like the human visual system, the module adaptively adjusts its focus and perceptual range to maximize the capture of target features. However, fixed-size convolutional kernels may cause information loss or confusion, leading to inaccurate segmentation results, especially when dealing with highly similar images. To address this issue, we introduce a spatial shift mechanism that performs pixel-level translation of the feature map. By taking into account the relative relationships between pixels, the network can learn more discriminative features, thereby improving segmentation accuracy. Based on this, we propose a network model called AMFFNet. We demonstrate the effectiveness of the proposed model on the PASCAL VOC 2012 and ADE20K datasets, achieving test-set mIoU of 91.7% and 46.76%, respectively, without any post-processing.
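The abstract does not specify how the spatial shift is implemented, so the following is only a minimal sketch of one common form of pixel-level feature-map translation, not the authors' method. It assumes (hypothetically) that channels are split into four equal groups, each translated by one pixel in a different direction with zero fill at the vacated border. The names `shift_channel`, `spatial_shift`, and `step` are illustrative; a real implementation would use tensor operations rather than nested lists.

```python
def shift_channel(channel, dy, dx, fill=0.0):
    """Translate one 2D channel by (dy, dx) pixels, filling vacated cells."""
    h, w = len(channel), len(channel[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sy, sx = y - dy, x - dx  # source pixel that lands on (y, x)
            if 0 <= sy < h and 0 <= sx < w:
                out[y][x] = channel[sy][sx]
    return out


def spatial_shift(feature_map, step=1):
    """Shift four channel groups right, left, down, and up (assumed scheme).

    feature_map is a list of channels, each a list of rows.
    Channels beyond the four equal groups are copied unchanged.
    """
    directions = [(0, step), (0, -step), (step, 0), (-step, 0)]
    group = len(feature_map) // 4
    out = []
    for c, channel in enumerate(feature_map):
        if group and c < 4 * group:
            dy, dx = directions[c // group]
            out.append(shift_channel(channel, dy, dx))
        else:
            out.append([row[:] for row in channel])
    return out
```

After such a shift, a per-pixel operation across channels (for example a 1x1 convolution) sees values from neighbouring positions, which is one way a network can take relative relationships between pixels into account.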
