Key points are not available for this paper at this time.
This paper presents GridNet, a new Convolutional Neural Network (CNN) for semantic image segmentation (full scene labelling). Classical networks are implemented as one stream from the input to the output with operators applied in the stream in order to reduce the feature maps and to increase the receptive field for the final prediction. However, for image segmentation, where the task consists in providing a semantic to each pixel of an image, feature maps reduction is harmful because it to a resolution loss in the output prediction. To tackle this problem, GridNet follows a grid pattern allowing multiple interconnected streams to at different resolutions. We show that our network generalizes many well networks such as conv-deconv, residual or U-Net networks. GridNet is from scratch and achieves competitive results on the Cityscapes.
Fourure et al. (Sun,) studied this question.