May 30, 2026 · Open Access

CMFA-Net: A CNN–Mamba Collaborative Feature Alignment Network for Robust Medical Image Segmentation

Key Points

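The study targets three obstacles in medical image segmentation: weak joint modeling of local details and long-range dependencies, the computational cost of transformer-based models on high-resolution inputs, and performance loss from domain shift across imaging centers and acquisition devices.
CMFA-Net uses a Vision Mamba (VSSM) encoder and a CNN–Mamba fusion attention (CMFA) module that combines convolutional local features with Mamba's long-range modeling.
A contrastive domain alignment learning (cDAL) strategy, built on an InfoNCE-based objective, is applied to intermediate features to improve cross-dataset robustness.
An enhanced multi-scale context aggregation decoder (EMCAD) narrows the semantic gap between encoder and decoder features and strengthens hierarchical feature fusion.
The method achieves competitive segmentation performance on the CirrMRI600+ pathological liver MRI dataset and several public polyp segmentation benchmarks.
Ablation studies under a shared experimental protocol support the contributions of the CMFA module, the EMCAD decoder, and the cDAL strategy.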

Abstract

Medical image segmentation still faces three critical challenges: insufficient joint modeling of local details and long-range dependencies, the high computational burden of transformer-based architectures for high-resolution inputs, and performance degradation caused by domain shift across imaging centers and acquisition devices. To address these issues, this paper proposes CMFA-Net, a CNN–Mamba collaborative feature alignment network for robust medical image segmentation. The proposed framework adopts Vision Mamba (VSSM) as the encoder backbone to capture long-range contextual dependencies with linear computational complexity. A CNN–Mamba fusion attention (CMFA) module is designed to integrate the local representation capability of convolution with the long-range modeling capability of Mamba, improving the segmentation of complex boundaries and multi-scale targets. In addition, an enhanced multi-scale context aggregation decoder (EMCAD) is introduced to reduce the semantic gap between encoder and decoder features and strengthen hierarchical feature fusion. To enhance cross-dataset robustness, a contrastive domain alignment learning (cDAL) strategy is applied in the intermediate feature space to learn domain-invariant discriminative representations via an InfoNCE-based objective. Experiments on the CirrMRI600+ pathological liver MRI dataset and several public polyp segmentation benchmarks show that the proposed method achieves competitive segmentation performance. Ablation studies provide empirical evidence for the contributions of the CMFA module, EMCAD decoder, and cDAL mechanism under the same experimental protocol. These results suggest that CMFA-Net is a promising framework for medical image segmentation across heterogeneous datasets.
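
The abstract names an InfoNCE-based objective for cDAL but gives no formula. The sketch below shows the generic InfoNCE loss for a single anchor feature, only to illustrate the kind of objective involved. It is not the paper's implementation: the function names, the cosine similarity, the temperature value, and the way positives and negatives are chosen are all assumptions made for this example.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss for one anchor (hypothetical helper).

    anchor, positive: feature vectors that should be pulled together,
        e.g. same-class features from two different domains (an assumption
        about how pairs might be formed, not taken from the paper).
    negatives: list of feature vectors that should be pushed away.
    temperature: assumed scaling hyperparameter.
    """
    # Logit 0 belongs to the positive pair; the rest belong to negatives.
    logits = [cosine_similarity(anchor, positive) / temperature]
    logits += [cosine_similarity(anchor, n) / temperature for n in negatives]

    # Numerically stable log-sum-exp over all logits.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))

    # Cross-entropy with the positive as the target class.
    return log_sum_exp - logits[0]


if __name__ == "__main__":
    anchor = [1.0, 0.0]
    aligned = info_nce_loss(anchor, [1.0, 0.0], [[0.0, 1.0]])
    misaligned = info_nce_loss(anchor, [0.0, 1.0], [[1.0, 0.0]])
    print(aligned < misaligned)
```

The loss is small when the anchor is closer to its positive than to any negative and grows when a negative is closer, which is the behavior a domain-alignment objective of this type relies on.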
