March 3, 2026 | Open Access

Multi-shape enhancement pyramid network for real-time semantic segmentation

Key Points

MSEPNet achieves 76.7% and 72.5% mean Intersection over Union (mIoU) on the Cityscapes and CamVid test sets, respectively.
With only 1.04 million parameters, MSEPNet runs at 144.4 and 108.9 frames per second on Cityscapes and CamVid, respectively.
The network combines three proposed modules: the efficient spatial inverted residual (ESIR), the efficient contextual residual (ECR), and the multi-shape enhancement pyramid (MSEP).
Additional experiments on the Stanford Background dataset support MSEPNet's robustness and generalization in diverse real-world scenarios.

Abstract

Semantic segmentation built on powerful convolutional neural networks (CNNs) and complex model structures achieves good accuracy, but its slow inference speed limits its use in practical applications such as autonomous driving and medical diagnosis. Real-time semantic segmentation has therefore received increasing attention. However, most existing real-time methods improve inference speed at a significant cost in segmentation precision, and striking a good balance between inference speed and precision remains a major issue. To address this issue, we propose a real-time semantic segmentation network, the Multi-Shape Enhancement Pyramid Network (MSEPNet). First, we propose an efficient spatial inverted residual (ESIR) module to effectively extract multi-scale spatial information. Next, to capture multi-scale semantic information while maintaining efficient inference speed, we introduce an efficient contextual residual (ECR) module. Finally, we present the multi-shape enhancement pyramid (MSEP) module to capture multi-scale and multi-shape contextual information. The proposed MSEPNet achieves competitive results on street scene datasets. Specifically, with only 1.04 million (1.04M) parameters, it achieves 76.7% and 72.5% mean Intersection over Union (mIoU) at 144.4 and 108.9 frames per second (FPS) on the Cityscapes and Cambridge-driving Labeled Video Database (CamVid) test sets, respectively. Furthermore, we conduct additional experiments on the Stanford Background dataset to verify the robustness of MSEPNet in diverse real-world environments, demonstrating its generalization ability beyond standard benchmarks.
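For readers unfamiliar with the accuracy metric quoted in the abstract, the following is a minimal sketch of how mean Intersection over Union is commonly computed from a confusion matrix. It is not the authors' evaluation code; the function name `mean_iou`, the `ignore_index` parameter, and the toy labels are illustrative assumptions.

```python
def mean_iou(pred, target, num_classes, ignore_index=None):
    """Mean Intersection over Union from two flat sequences of class labels.

    Illustrative sketch only; names and the ignore_index convention are
    assumptions, not taken from the paper.
    """
    # confusion[t][p] counts pixels whose true class is t and predicted class is p
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # skip unlabeled pixels
        confusion[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp
        fp = sum(confusion[r][c] for r in range(num_classes)) - tp
        denom = tp + fp + fn
        if denom == 0:
            continue  # class absent from both prediction and ground truth
        ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: six pixels, three classes.
target = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
print(mean_iou(pred, target, num_classes=3))
```

In this toy case the per-class IoU values are 1/3, 2/3, and 1/2, so the mean is approximately 0.5. Benchmark evaluations typically accumulate the confusion matrix over the entire test set before taking the per-class ratios, rather than averaging per-image scores.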

Cite This Study

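Chen et al. (2026). Multi-shape enhancement pyramid network for real-time semantic segmentation. Engineering Applications of Artificial Intelligence.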

synapsesocial.com/papers/69a767e5badf0bb9e87e2d01
https://doi.org/10.1016/j.engappai.2026.114059

Journals

Engineering Applications of Artificial Intelligence

Institutions

Guangzhou University
