Accurate medical image segmentation is essential for reliable diagnosis, treatment planning, and disease monitoring. Existing convolutional and Transformer-based models, such as U-Net and its variants, often extract redundant spatial features and learn spurious correlations from imaging artifacts, which reduces generalization and robustness in clinical environments. Vision State-Space Models (VSSMs), including VM-UNet, improve computational efficiency and long-range dependency modeling but lack explicit mechanisms to suppress irrelevant activations and maintain compact representations. To address these issues, we present EFSVMNet, an Enhanced Feature-Selective Vision Mamba Network, and its EFSVMNet-Lite variant, which introduce adaptive feature suppression within a state-space framework for robust and efficient medical image segmentation. EFSVMNet integrates four complementary components: (i) a Spatial Feature-Selective (SFS) block that filters task-irrelevant activations, (ii) a Gradient Reversal Layer (GRL) that promotes adversarial feature unlearning, (iii) a Dilated Cross-Fusion Spatial Attention (DCFSA) module that enhances multi-scale contextual fusion, and (iv) a Masked Adaptive Singular Value Decomposition (SVD) loss that enforces low-rank feature regularization. Experiments on seven benchmark datasets show consistent gains over VM-UNet and related baselines, with improvements of up to +4.0% mIoU and +2.5% Dice. Further analysis shows that EFSVMNet-Lite offers superior robustness under Gaussian and Poisson noise. These results indicate that incorporating explicit feature suppression into a state-space formulation substantially improves both segmentation reliability and computational efficiency.
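For readers less familiar with the two metrics reported above, the following is a minimal sketch of how Dice and IoU are commonly computed for a single pair of binary masks. It is not the authors' evaluation code: the function name `dice_and_iou`, the nested-list mask format, and the convention of scoring two empty masks as 1.0 are assumptions made for illustration. mIoU is typically the mean of such IoU values over classes or images; the exact averaging protocol used in the paper is not stated here.

```python
def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two binary masks given as nested lists of 0/1.

    Hypothetical helper for illustration only.
    """
    flat_pred = [p for row in pred for p in row]
    flat_target = [t for row in target for t in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must contain the same number of pixels")

    # Pixels that are foreground in both masks.
    intersection = sum(1 for p, t in zip(flat_pred, flat_target) if p and t)
    pred_area = sum(1 for p in flat_pred if p)
    target_area = sum(1 for t in flat_target if t)
    union = pred_area + target_area - intersection

    if union == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0, 1.0

    dice = 2 * intersection / (pred_area + target_area)
    iou = intersection / union
    return dice, iou


# Toy 3x3 example: the prediction misses one foreground pixel.
pred = [[0, 1, 1],
        [0, 1, 0],
        [0, 0, 0]]
target = [[0, 1, 1],
          [0, 1, 1],
          [0, 0, 0]]

dice, iou = dice_and_iou(pred, target)
print(f"Dice = {dice:.3f}, IoU = {iou:.3f}")
```

For a single mask pair the two scores are related by Dice = 2·IoU / (1 + IoU), so Dice is never lower than IoU. This is one reason the same model change can produce gains of different sizes on the two metrics.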
Aithal et al. studied this question.