Deep learning methods based on convolutional neural networks (CNNs) and Mamba have advanced medical image segmentation, yet two challenges remain: (1) a trade-off in feature extraction, where CNNs capture local details but miss global context, whereas Mamba captures global dependencies but overlooks fine structures; and (2) limited feature aggregation, as existing methods insufficiently integrate the information common to different layers and the differences (delta details) between them, which hinders robustness to subtle structures. To address these issues, we propose a hybrid cascade and dual-path adaptive aggregation network (HCDAA-Net). For feature extraction, we design a hybrid cascade structure (HCS) that alternately applies ResNet and Mamba modules, balancing local detail preservation against global semantic modeling in the spatial domain. We further employ a general channel-crossing attention mechanism to enhance feature expression, which complements this spatial modeling and accelerates convergence. For feature aggregation, we first propose correlation-aware aggregation (CAA) to model correlations among features of the same lesions or anatomical structures. Second, we develop a dual-path adaptive feature aggregation (DAFA) module: the common path captures stable cross-layer semantics and suppresses redundancy, while the delta path emphasizes subtle differences to strengthen the model’s sensitivity to fine details. Finally, we introduce a residual-gated visual state space module (RG-VSS), which dynamically modulates information flow via a convolution-enhanced residual gating mechanism to refine the fused representations. Experiments on diverse datasets demonstrate that HCDAA-Net outperforms several state-of-the-art (SOTA) approaches.
Ren et al. studied this question.
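The abstract does not give the formulas behind the dual-path idea, so the following is only a toy sketch of how a common path and a delta path might be combined for two feature vectors from adjacent layers. Every name and operation here is an assumption made for illustration and is not taken from HCDAA-Net: `dual_path_aggregate` is a hypothetical helper, the common path is taken to be an element-wise mean, the delta path an element-wise difference, and the adaptive weight a sigmoid of the difference magnitude.

```python
import math


def sigmoid(x):
    """Logistic function, used here as a simple gate in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def dual_path_aggregate(shallow, deep):
    """Fuse two equal-length feature vectors (hypothetical illustration).

    Assumed operations, not the paper's definitions:
      common path: element-wise mean (what both layers agree on)
      delta path:  element-wise difference shallow - deep
      gate:        sigmoid(|delta|), so larger differences get more weight
    """
    if len(shallow) != len(deep):
        raise ValueError("feature vectors must have the same length")

    fused = []
    for s, d in zip(shallow, deep):
        common = (s + d) / 2.0      # shared cross-layer content
        delta = s - d               # layer-specific detail
        gate = sigmoid(abs(delta))  # adaptive weight for the detail
        fused.append(common + gate * delta)
    return fused


if __name__ == "__main__":
    # Identical inputs have no delta, so the output equals the input.
    print(dual_path_aggregate([1.0, 2.0], [1.0, 2.0]))
    # Differing inputs add a gated share of the difference to the mean.
    print(dual_path_aggregate([2.0, 0.5], [0.0, 0.5]))
```

In this toy version, positions where the two layers agree pass through unchanged, while positions where they differ receive an extra contribution scaled by the gate. A real module would operate on multi-channel feature maps with learned weights rather than fixed arithmetic.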