Semantic segmentation is a fundamental task in computer vision that assigns a categorical label to each pixel in an image, enabling dense and detailed scene understanding. This pixel-level classification is especially important in autonomous driving, where accurate environmental perception underpins dependable object detection and safe decision-making. In this study, we propose MultiDecNet, a novel multi-decoder semantic segmentation framework designed to capture both macroscopic scene layouts and fine-grained spatial boundaries in complex urban environments. Drawing inspiration from classical networks, MultiDecNet uses a parallel dual-branch decoding strategy that combines the multi-scale context modeling of the Pyramid Pooling Module (PPM) with the structural refinement capabilities of Atrous Spatial Pyramid Pooling (ASPP). To explore the impact of modern backbone representations, we update the feature extraction pipeline by introducing the ConvNeXt convolutional architecture as an alternative to the traditional ResNet101 backbone. We evaluate and compare the baseline configurations and the proposed MultiDecNet, each with ResNet101 and ConvNeXt-Large backbones, on the Cityscapes benchmark dataset. The quantitative results show that the MultiDecNet architecture consistently delivers highly competitive performance within the scope of this comparative study, with the MultiDecNet-ConvNeXt variant achieving favorable overall scores among the evaluated methods. Furthermore, a class-wise IoU and training-dynamics analysis reveals that, while traditional networks remain competitive on the boundaries of localized minority-class targets, the ConvNeXt backbone converges faster and more stably and handles large-scale driving layouts in a more balanced way.
Ultimately, these findings offer critical insights into architectural synergy and backbone selection, presenting a robust, scalable, and well-balanced solution for advanced autonomous navigation systems.
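The class-wise IoU analysis mentioned above can be illustrated with a minimal, dependency-free sketch. It is not the authors' evaluation code: the function names are hypothetical, labels are flattened into plain lists, and the ignore label of 255 is an assumption that follows a common Cityscapes training convention. For each class, IoU is computed as TP / (TP + FP + FN) from a confusion matrix, and the mean IoU averages over the classes that actually occur.

```python
from typing import List, Optional


def confusion_matrix(y_true: List[int], y_pred: List[int],
                     num_classes: int, ignore_index: int = 255) -> List[List[int]]:
    """Count pixels per (ground-truth class, predicted class) pair.

    ignore_index=255 is an assumed convention, not taken from the paper.
    """
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        if t == ignore_index:  # skip pixels without a valid label
            continue
        cm[t][p] += 1
    return cm


def class_iou(cm: List[List[int]]) -> List[Optional[float]]:
    """Per-class IoU = TP / (TP + FP + FN); None if the class never appears."""
    n = len(cm)
    ious: List[Optional[float]] = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # true class c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp   # predicted c, true class differs
        denom = tp + fp + fn
        ious.append(tp / denom if denom else None)
    return ious


def mean_iou(ious: List[Optional[float]]) -> float:
    """Average IoU over classes with a defined score."""
    valid = [v for v in ious if v is not None]
    return sum(valid) / len(valid) if valid else 0.0


if __name__ == "__main__":
    # Toy flattened label maps with three classes and one ignored pixel.
    y_true = [0, 0, 1, 1, 2, 255]
    y_pred = [0, 1, 1, 1, 2, 0]
    ious = class_iou(confusion_matrix(y_true, y_pred, num_classes=3))
    print(ious, mean_iou(ious))
```

Reporting the per-class values alongside the mean is what makes it possible to see cases like the one described in the abstract, where one model leads on the overall score while another stays competitive on small, infrequent classes.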
Soylu et al. (Mon,) studied this question.