Accurate segmentation of medical images is crucial for diagnosing and treating cardiovascular diseases in the elderly. However, these images often suffer from blurred boundaries and low contrast caused by complex lesions such as calcification and plaque, which makes it difficult for existing methods to capture global context while preserving local details. To address this, we propose CTM-Net, a collaborative framework that integrates convolutional neural networks (CNNs), Transformers, and multilayer perceptrons (MLPs). A CNN encoder extracts hierarchical local features, a Transformer module at the bottleneck captures long-range dependencies, and a lightweight MLP-based decoder built on a novel Spatial-Channel MLP (SC-MLP) block performs efficient upsampling and pixel-level classification. Skip connections provide multi-scale feature fusion. Experiments on three public cardiovascular datasets (ASOCA, Cardiac-MRI, Sunnybrook) demonstrate that our method significantly outperforms mainstream models such as U-Net, TransUNet, and nnUNet on key metrics (e.g., 82.1% Dice on ASOCA vs. 81.5% for nnUNet, p < 0.05) while maintaining superior computational efficiency. Systematic ablation studies validate the synergistic design. This framework offers a promising pathway for complex cardiovascular image segmentation.
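For readers unfamiliar with the Dice metric quoted above, the following is a minimal sketch of the standard Dice coefficient, 2|A ∩ B| / (|A| + |B|), for two binary masks. It is not the authors' evaluation code: the function name `dice_score`, the flattened-list input format, the toy masks, and the convention of returning 1.0 for two empty masks are all assumptions made for illustration.

```python
from typing import Sequence


def dice_score(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice coefficient between two flattened binary masks (values 0 or 1)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels labelled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    # Total foreground pixels across the two masks.
    total = sum(1 for p in pred if p == 1) + sum(1 for t in target if t == 1)
    if total == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / total


# Toy 2x3 masks, flattened row by row (illustrative values, not from the paper).
pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
score = dice_score(pred, target)
print(f"Dice = {score:.3f}")  # prints "Dice = 0.667"
```

In the toy example the masks share two foreground pixels and contain three each, giving 2 · 2 / (3 + 3) ≈ 0.667. A reported value such as 82.1% corresponds to a score of 0.821 on this scale; how scores are aggregated over cases or classes is not stated in the abstract.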
Chen et al. (Fri,) studied this question.